A System for Gathering Data From Classifieds Websites

This program gathers data from classifieds websites. Web crawlers imitate the actions of a website user and collect the required information. In addition to text data, the robots also recognize information in images, such as locations and phone numbers.
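As a rough illustration of the image step, the sketch below pulls phone numbers out of text that an OCR engine has already produced. It is not the project's actual code: the `PHONE_RE` pattern, the 10 to 15 digit length rule, and the function names are assumptions made for this example, and `ocr_image` assumes the `pytesseract` and Pillow packages are installed alongside Tesseract.

```python
import re

# Hypothetical pattern: an optional "+", then digits separated by spaces,
# dots, dashes, or parentheses. Real listings may need stricter rules.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def ocr_image(path):
    """Return the raw text Tesseract recognizes in an image file.

    Assumes pytesseract and Pillow are installed; imported lazily so the
    rest of the module works without them.
    """
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(path))


def extract_phones(text):
    """Find phone-like strings in OCR output and normalize them to digits."""
    phones = []
    for match in PHONE_RE.finditer(text):
        raw = match.group()
        digits = re.sub(r"\D", "", raw)
        # Assumed length bounds; they filter out prices and other numbers.
        if 10 <= len(digits) <= 15:
            normalized = ("+" if raw.startswith("+") else "") + digits
            if normalized not in phones:
                phones.append(normalized)
    return phones


if __name__ == "__main__":
    sample = "Call +1 (555) 010-1234 or 555 010 9999 now"
    print(extract_phones(sample))
```

A robot could call `extract_phones(ocr_image("phone.png"))` for each image in a listing; the file name here is only a placeholder.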

SOLUTION

We implemented autotests that check the functionality of the target websites. Collecting information from one resource takes 3 to 6 days, so before starting a collection run you need to check whether the functionality or the position of the page blocks has changed. Otherwise the robots can get lost.
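One way to picture such a pre-run check is a small function that reports which page blocks a robot depends on are no longer present. This is a minimal sketch using only the Python standard library, not the project's autotests: the block names in `REQUIRED`, the sample HTML, and the helper `missing_blocks` are all hypothetical.

```python
from html.parser import HTMLParser


class BlockCollector(HTMLParser):
    """Record every (tag, class) pair that appears in a page."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        # A valueless class attribute yields None, so fall back to "".
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            self.seen.add((tag, cls))


def missing_blocks(html, required):
    """Return the required (tag, class) pairs the page no longer contains."""
    collector = BlockCollector()
    collector.feed(html)
    return [block for block in required if block not in collector.seen]


# Hypothetical blocks a robot depends on for one listing page.
REQUIRED = [("h1", "listing-title"), ("span", "price"), ("img", "phone-image")]

OLD_LAYOUT = """
<div class="listing">
  <h1 class="listing-title">Bike for sale</h1>
  <span class="price">100</span>
  <img class="phone-image" src="phone.png">
</div>
"""

# The same page after a redesign: the price moved into a <div>.
NEW_LAYOUT = OLD_LAYOUT.replace(
    '<span class="price">100</span>', '<div class="price">100</div>'
)

print(missing_blocks(OLD_LAYOUT, REQUIRED))  # []
print(missing_blocks(NEW_LAYOUT, REQUIRED))  # [('span', 'price')]
```

In a PyTest suite the same idea could be written as an assertion that `missing_blocks(page_html, REQUIRED)` is empty, so a multi-day crawl is not started against a layout the robot no longer understands.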

TECHNOLOGIES

Development: Scrapy, Spark, Scala, Java, Python, Tesseract
Testing tools: XPath, Selenium, PyTest, JSON, request

7 months of development
10 robots developed
1,000,000 records per day
90% recognition of image data