1001 Freelance Projects
Latest Projects from
Freelance Marketplaces
View Project
Project title:
AI specialist for advanced scraping tool for housing websites
Posted by:
External project from PeoplePerHour
Started:
17-Dec-2024 12:34 GMT
Description:
I am looking for an experienced AI specialist to develop a Windows Service in C# that does the following:
Every day, visit a list of approximately 800 URLs of real estate agency websites and navigate through the pages to search for newly listed properties added by the agencies.

Next, these property pages must be read, and the relevant data extracted to be stored in a fixed format in tables on an SQL server.
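To make "newly listed" concrete, here is a minimal sketch of one possible approach: treat the property page URL as the key and report only URLs that have not been stored before. It is written in Python with an in-memory SQLite database purely for illustration; the real service would be C# against SQL Server, and the table name `listing`, the column `first_seen`, and both function names are assumptions, not part of the brief.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the store and create the table; the URL is the uniqueness key."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listing ("
        " url TEXT PRIMARY KEY,"
        " first_seen TEXT NOT NULL)"
    )
    return conn


def record_new(conn, urls, today):
    """Insert URLs not seen before and return them in input order."""
    new = []
    for url in urls:
        cur = conn.execute(
            "INSERT OR IGNORE INTO listing (url, first_seen) VALUES (?, ?)",
            (url, today),
        )
        # rowcount is 1 when the row was inserted, 0 when it already existed.
        if cur.rowcount == 1:
            new.append(url)
    conn.commit()
    return new
```

Only the URLs returned by `record_new` would then need to be fetched and parsed in full on a given day.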

A number of data fields are mandatory, such as:

The direct URL of the property page within the real estate agency's website (to enforce uniqueness)
The city where the property is located
The street where the property is located
The property type, chosen from our fixed list (entire home, apartment, studio, etc.); the engine must select the closest match from this list
The number of rooms
The monthly rental price
Whether this price includes or excludes service charges
The date the property is available
The surface area in square meters
A list of URLs of the photos associated with the property

Additionally, there is a list of optional fields we would like to retrieve if the information is available:

Municipality
District
Postal code
House number
Number of bedrooms
Number of bathrooms
Year of construction
Is there a: garden, garage, rooftop terrace, balcony?
Condition of the property
Is the property furnished?
...and so on

A complete list will be provided.
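As a rough illustration of the fixed format, the fields above could be held in a record like the one below before being written to the database. This is only a sketch in Python (the service itself is to be C#): the field names are assumptions, only a few optional fields are shown, and `PROPERTY_TYPES` is a hypothetical excerpt of our list. The closest-match helper uses plain string similarity from the standard library, which is a simple stand-in and not a proposed AI method.

```python
from dataclasses import dataclass, field
from difflib import get_close_matches
from typing import Optional

# Hypothetical excerpt; the complete fixed list will be provided.
PROPERTY_TYPES = ["entire home", "apartment", "studio"]


def closest_property_type(raw, allowed=PROPERTY_TYPES):
    """Return the allowed type most similar to the raw text, or None."""
    matches = get_close_matches(raw.strip().lower(), allowed, n=1, cutoff=0.6)
    return matches[0] if matches else None


@dataclass
class Listing:
    # Mandatory fields
    url: str                          # direct property page URL, unique
    city: str
    street: str
    property_type: str                # one of PROPERTY_TYPES
    rooms: int
    monthly_rent: float
    rent_includes_service_charges: bool
    available_from: str               # ISO date string, an assumed format
    surface_m2: float
    photo_urls: list = field(default_factory=list)
    # A few of the optional fields; None means "not found on the page"
    postal_code: Optional[str] = None
    bedrooms: Optional[int] = None
    furnished: Optional[bool] = None
```

String similarity handles spelling variants but not synonyms (for example "flat" for "apartment"), which is one place where a language model or a synonym table would be needed.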

The challenge lies in the fact that each real estate agency uses a different paging method and different page layouts. Furthermore, some agencies include all the information in one block of text, while others display much of the data in columns. This can also change unexpectedly. Therefore, the software must be resilient and capable of understanding how to navigate through the pages to look for new properties.
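To show what "navigating through the pages" involves, the sketch below looks for a next-page link using two common conventions: a `rel="next"` attribute, or a link whose visible label is something like "Next". It is Python for illustration only, the labels in `NEXT_WORDS` are assumptions, and fixed heuristics like these would cover only some of the sites; handling the rest is exactly where the AI component would come in.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkFinder(HTMLParser):
    """Remembers the first link that looks like a 'next page' link."""

    NEXT_WORDS = {"next", "next page", "›", "»", ">"}  # assumed labels

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.next_url = None
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # visible text collected inside that <a>

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("a", "link") and a.get("href"):
            rel = (a.get("rel") or "").lower().split()
            if "next" in rel and self.next_url is None:
                self.next_url = urljoin(self.base_url, a["href"])
            if tag == "a":
                self._href, self._text = a["href"], []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = "".join(self._text).strip().lower()
            if label in self.NEXT_WORDS and self.next_url is None:
                self.next_url = urljoin(self.base_url, self._href)
            self._href = None


def find_next_page(html, base_url):
    """Return the absolute URL of the next page, or None if none is found."""
    finder = NextLinkFinder(base_url)
    finder.feed(html)
    return finder.next_url
```

A crawler would call `find_next_page` repeatedly until it returns `None` or a URL it has already visited.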

A second challenge is that some agencies include photos of other nearby properties under the details of a specific property. The tool must recognize that these photos do not belong to the property in question and should ignore them.
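One simple heuristic for this, shown here only as a sketch, is to skip any image that is wrapped in a link to a different page, since "nearby property" thumbnails usually link to that other property. The code is Python for illustration, the class and function names are hypothetical, and the rule that a link to an image file is a full-size view of the same photo is an assumption. Sites that show nearby photos without links would need a smarter approach.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

IMAGE_EXT = (".jpg", ".jpeg", ".png", ".webp")  # assumed image file endings


class PhotoCollector(HTMLParser):
    """Collects image URLs, skipping images that link to another page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.photos = []
        self._links = []  # resolved hrefs of the <a> tags currently open

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a":
            href = urljoin(self.page_url, a.get("href") or "")
            self._links.append(urldefrag(href).url)
        elif tag == "img" and a.get("src"):
            src = urljoin(self.page_url, a["src"])
            target = self._links[-1] if self._links else None
            # Keep unlinked images, self-links, and links to an image file.
            if (target is None or target == self.page_url
                    or target.lower().endswith(IMAGE_EXT)):
                self.photos.append(src)

    def handle_endtag(self, tag):
        if tag == "a" and self._links:
            self._links.pop()


def own_photos(html, page_url):
    """Return photo URLs that likely belong to the property on this page."""
    collector = PhotoCollector(page_url)
    collector.feed(html)
    return collector.photos
```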

For cost reasons, we would prefer an AI model that does not rely on a commercial API, unless a commercial API offers benefits significant enough to justify the expense.

I would love to hear about your experience and how you would approach this. Specifically: which AI method/engine you would use and the flow of the software.
Project ID:
3413051
Project category:
Project budget: