graphical-webcrawler

The first industrial graphic web crawler.

web_crawler2.py
==================
Prototype idea to test out possibilities.

sites.py
==================
Algorithm for grabbing weblinks off Wikipedia and crawling through those weblinks.

Algorithm description:

1. Start at the Wikipedia main page.
2. `mappedlist.append(all hrefs on this page)`
3. `stagedlist.append(the main page weblink)`
4. Double verification: check `mappedlist` and `stagedlist` for any duplicates and matches. Matches will never be visited again, and duplicates are removed.
5. Rinse and repeat to infinity.
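The steps above could be sketched roughly as follows. This is a minimal illustration, not the code in `sites.py`: the names `HrefCollector`, `extract_links`, `verify`, and `crawl` are hypothetical, as are the `fetch` callable and the `max_pages` cap (added so the loop stops instead of running "to infinity"). It also assumes that a "match" means a link in `mappedlist` that already appears in `stagedlist`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) links found in html, with fragments stripped."""
    parser = HrefCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            links.append(absolute)
    return links


def verify(mappedlist, stagedlist):
    """Double verification (assumed meaning): drop duplicates from
    mappedlist and drop anything that matches an entry in stagedlist."""
    staged = set(stagedlist)
    seen = set()
    cleaned = []
    for link in mappedlist:
        if link not in staged and link not in seen:
            seen.add(link)
            cleaned.append(link)
    return cleaned


def crawl(start_url, fetch, max_pages=10):
    """Visit pages breadth-first; fetch(url) must return the page HTML.

    mappedlist holds links still to visit, stagedlist holds visited links.
    max_pages is an assumed safety cap, not part of the original description.
    """
    mappedlist = [start_url]
    stagedlist = []
    while mappedlist and len(stagedlist) < max_pages:
        url = mappedlist.pop(0)
        stagedlist.append(url)                              # stage the visited page
        mappedlist.extend(extract_links(url, fetch(url)))   # map all hrefs on it
        mappedlist = verify(mappedlist, stagedlist)         # double verification
    return stagedlist
```

Passing `fetch` in as a parameter keeps the sketch testable without network access; a real run might supply a function built on `urllib.request.urlopen` and start from the Wikipedia main page URL.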

trickyspeed/graphical-webcrawler
