Welcome to Python-Ciur

Ciur

Ciur is a scraper layer in development.

Ciur is a lib because it has less black magic than a framework

It exports all scraper-related code into a separate layer.

If you are annoyed by spaghetti code, SQL inside PHP, or inline CSS inside HTML, then you are also annoyed by XPath/CSS selector code inside a crawler.

Ciur gives you a taste of lasagna code, generally by enforcing encapsulation of the scraping layer.

It tries not to repeat the bad code.
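The separation described above can be illustrated with a minimal sketch. It does not use Ciur itself and does not show Ciur's rule syntax: the `RULES` mapping, the `parse` helper, and the sample `PAGE` are all hypothetical names made up for this illustration, and the standard library's `xml.etree.ElementTree` stands in for a real HTML parser. The only point is that the selectors live in one place and the crawling code never mentions them.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set: field name -> ElementTree path expression.
# It is kept apart from the crawling code. This is NOT Ciur's rule syntax.
RULES = {
    "name": ".//h1",
    "paragraph": ".//p",
}


def parse(document, rules):
    """Apply every rule to the document and collect the matched text."""
    root = ET.fromstring(document)
    result = {}
    for field, path in rules.items():
        node = root.find(path)
        # A rule that matches nothing (or an empty element) yields None.
        has_text = node is not None and node.text
        result[field] = node.text.strip() if has_text else None
    return result


# Stand-in for a downloaded page; a well-formed snippet so ElementTree can read it.
PAGE = """<html><body>
<h1>Example Domain</h1>
<p>This domain is for use in illustrative examples.</p>
</body></html>"""

record = parse(PAGE, RULES)
print(record)
```

With this layout, changing what gets extracted means editing `RULES` only; the function that fetches and walks pages stays untouched.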

What does CIUR mean?

Ciur is Romanian for Sieve.

It fulfils the same purpose in the sense of being a device for separating wanted elements from unwanted material.

Python ciur API

>>> import ciur
>>> from ciur.shortcuts import pretty_parse_from_resources
>>> with ciur.open_file("example.org.ciur", __file__) as f:
...    print(pretty_parse_from_resources(
...            f,
...            "http://example.org"
...    ))
{
    "root": {
        "name": "Example Domain",
        "paragraph": "This domain is established to be used for illustrative examples in documents. You may use this\n    domain in examples without prior coordination or asking for permission."
    }
}

Samples of usage:

For Developers:

TODO:

Ciur Documentation

If you can’t find the information you’re looking for, have a look at the index or try to find it using the search function: