A car search engine that considers the factors that actually matter

TechLabs Düsseldorf
Oct 3, 2022

This project was carried out as part of the TechLabs “Digital Shaper Program” in Düsseldorf (Summer Term 2022).

Abstract

Anyone who wants to buy a car is often faced with a difficult decision and too many choices. In addition, some increasingly important criteria such as budget and emissions cannot yet be taken into account in most tools. We would like to give users a tool that takes these factors into account and helps them find a suitable car.

To use the tool, the user answers a few questions. These preferences are used to exclude unsuitable vehicles from the database. The processed selection is then returned to the user as a graphical and textual result.

Introduction

Imagine you need a new car and you have to choose which one to buy. You have to make the right decision, because cars are expensive and your budget is limited. All the important criteria have to fit, and those are not the date of first registration, the mileage of a used car or the payment method that traditional search engines offer as filters.

Instead, sustainability is what counts in today’s world. Your new car should be as environmentally friendly as possible while still being practical. After all, your whole family has to use the car every day. Your budget is also limited, so you can’t afford just any car: the purchase is a one-time cost, but there are running costs as well, such as gasoline. Another factor is personal preference. There are some brands you simply don’t want to be seen in.

Method

Once we had found one very good dataset, we did not need to look for a second or third. Instead, we could focus on this one, which was already very clean: there were no missing values, and we only had to remove duplicates. We then explored the dataset using descriptive statistics and visualizations.

Our next step was creating the user input. We considered 5 variables for which we formulated the following questions:

● How many people typically use your car at once?

● How many kilometers do you drive weekly?

● Enter your favorite transmission type (AUTOMATIC or MANUAL). Type NO if you don’t have a favorite type.

● Is there any brand you would like to exclude from the selection? If you want to exclude multiple brands, enter them separated by space. Type NO if you don’t want to exclude a brand.

● What is your weekly budget for fuel (in €)?
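The five answers arrive as free text, so they need to be normalized before any filtering can happen. The following is a minimal sketch of that step, not the project's actual code: the `Preferences` class, the `parse_preferences` function and all field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Preferences:
    """User answers in a form a filtering step can use (hypothetical structure)."""
    passengers: int
    weekly_km: float
    transmission: Optional[str]  # "AUTOMATIC", "MANUAL", or None for no preference
    excluded_brands: Set[str] = field(default_factory=set)
    weekly_fuel_budget_eur: float = 0.0


def parse_preferences(passengers, weekly_km, transmission, excluded_brands, weekly_budget):
    """Convert the five raw text answers into a Preferences object."""
    gearbox = transmission.strip().upper()
    if gearbox == "NO":
        gearbox = None  # the user has no favorite transmission type
    elif gearbox not in ("AUTOMATIC", "MANUAL"):
        raise ValueError("transmission must be AUTOMATIC, MANUAL or NO")

    brands_answer = excluded_brands.strip()
    if brands_answer.upper() == "NO":
        brands = set()  # nothing to exclude
    else:
        # Brands are separated by spaces, as the question asks.
        brands = {brand.upper() for brand in brands_answer.split()}

    return Preferences(
        passengers=int(passengers),
        weekly_km=float(weekly_km),
        transmission=gearbox,
        excluded_brands=brands,
        weekly_fuel_budget_eur=float(weekly_budget),
    )


# Example with made-up answers
prefs = parse_preferences("4", "250", "no", "Fiat Opel", "60")
```

One consequence of separating brands by spaces is that a brand name consisting of two words would be split into two entries, so a real implementation might prefer a different separator.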

The inputs are then processed, and the cars that do not match the user’s answers are removed from the dataframe. First, we assign a number of seats to each car class and append it to the dataframe. We then drop all cars that have fewer seats than the user needs, as well as those that have far more seats (+2) than needed, since both are equally unsuitable. CO2 emissions and fuel consumption are also calculated, but we cannot exclude any cars based on these values; they are only used later, when we compare the group of selected cars with the rest. Next, cars with an unwanted transmission type or from an excluded brand are removed. To take the weekly budget into account, we researched current fuel prices and calculated each car’s weekly fuel cost from the kilometers driven per week.
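To make the filtering steps concrete, here is a minimal pandas sketch under stated assumptions. The column names, the seat count per class, and the fuel prices are all placeholders invented for this example and are not taken from the project's dataset or research. The sketch also reads "(+2)" as "more than two extra seats" and assumes that cars whose weekly fuel cost exceeds the budget are dropped; the project may handle both points differently.

```python
import pandas as pd

# Hypothetical seat counts per vehicle class (not the project's mapping).
SEATS_BY_CLASS = {"TWO-SEATER": 2, "COMPACT": 4, "MID-SIZE": 5, "MINIVAN": 7}

# Placeholder fuel prices in EUR per litre (not the prices the team researched).
FUEL_PRICE_EUR = {"petrol": 1.90, "diesel": 2.00}


def filter_cars(cars, passengers, weekly_km, transmission, excluded_brands, weekly_budget):
    """Return the cars that match the user's answers, with their weekly fuel cost."""
    # Append a seat count per car; classes missing from the mapping become NaN.
    df = cars.assign(seats=cars["vehicle_class"].map(SEATS_BY_CLASS))

    # Keep cars with enough seats but at most two more than needed (NaN is dropped).
    df = df[df["seats"].between(passengers, passengers + 2)]

    # Transmission filter is skipped when the user has no preference (None).
    if transmission is not None:
        df = df[df["transmission"].str.upper() == transmission]

    # Remove excluded brands, compared case-insensitively.
    df = df[~df["make"].str.upper().isin(list(excluded_brands))]

    # Weekly cost = (km / 100) * litres per 100 km * price per litre.
    price = df["fuel_type"].map(FUEL_PRICE_EUR)
    df = df.assign(weekly_fuel_cost=weekly_km / 100 * df["consumption_l_per_100km"] * price)

    # Assumption: cars above the weekly fuel budget are removed.
    return df[df["weekly_fuel_cost"] <= weekly_budget]


# Tiny made-up dataset for illustration only
sample = pd.DataFrame({
    "make": ["Fiat", "Opel", "VW", "Ford", "Smart"],
    "vehicle_class": ["COMPACT", "MID-SIZE", "MID-SIZE", "MINIVAN", "TWO-SEATER"],
    "transmission": ["MANUAL", "AUTOMATIC", "MANUAL", "AUTOMATIC", "AUTOMATIC"],
    "fuel_type": ["petrol", "diesel", "petrol", "diesel", "petrol"],
    "consumption_l_per_100km": [6.0, 5.0, 12.0, 7.0, 4.5],
})

matches = filter_cars(sample, passengers=4, weekly_km=300, transmission=None,
                      excluded_brands={"FIAT"}, weekly_budget=40)
```

Each filter is a separate boolean mask, so a further criterion could be added as one more line without touching the others.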

The final step was to visualize the results. Here we compare the cars that fit the desired criteria with the full set of cars and list their makes and models, so that the user can search specifically for offers of these cars.
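A comparison of this kind could be drawn as a scatter plot with the matching cars highlighted against all cars. The sketch below uses matplotlib and plain lists of value pairs; the function name `plot_selection`, the colors and the units (g/km and l/100 km) are assumptions made for this example, not details from the project.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; this line is not needed in a notebook
import matplotlib.pyplot as plt


def plot_selection(all_cars, selected):
    """Plot matching cars on top of all cars.

    Both arguments are lists of (co2_per_km, fuel_per_100km) tuples.
    """
    fig, ax = plt.subplots()
    if all_cars:
        co2, fuel = zip(*all_cars)
        ax.scatter(co2, fuel, color="lightgray", label="all cars")
    if selected:
        co2, fuel = zip(*selected)
        ax.scatter(co2, fuel, color="tab:green", label="matching cars")
    ax.set_xlabel("CO2 emissions (g/km)")
    ax.set_ylabel("Fuel consumption (l/100 km)")
    ax.legend()
    return ax


# Made-up values for illustration only
ax = plot_selection(
    all_cars=[(120, 5.0), (200, 9.0), (250, 11.0), (130, 5.5)],
    selected=[(120, 5.0), (130, 5.5)],
)
ax.figure.savefig("selection.png")
```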

Result

The result of our tool is the set of cars that remain after the input has been processed. Naturally, this set varies with the user’s input. In all the cases we tested with realistic input, 90% to 95% of the cars were eliminated, which narrows the choice considerably for the user.

In our example plot of CO2 emissions per km against fuel consumption per 100 km, the selected cars lie close together. We take this as an indication that the tool produces meaningful results that reliably represent a group of matching cars.

Our initial goal was to let users base their car purchase on the criteria that are important to them, and we believe this project is a first proof of concept for that. We are aware that the underlying dataset allows filtering by only a few criteria. However, we do not see any technical limitation here: combining it with existing car model databases could scale the solution. In the course of the project, we had to compensate for dropouts in the project team, so looking back we are satisfied with the result of our work. We both learned a lot and gained experience. Nevertheless, we are aware that the requirements could have been implemented better under different circumstances. But that is a task for next year!

Find our project on Google Colab

The Team:

Tobias Demming: Data Science Track

Lara Gerlach: Data Science Track
