What you selling?

This project was part of the core course Intro to Data Science. The coursework was taught primarily in R so that students from various disciplines could grasp the concepts of data science.

The project's purpose was to uncover trends, in a dataset of over a hundred hotel bookings, that influence whether a guest cancels their reservation. We wanted to understand the factors that drive cancellations and whether a cancellation can be anticipated.

The project was required to be performed primarily in R, but given the background of most team members, we decided to use Python for advanced visualization.

This was allowed, since we kept the key analysis in R, which was in the spirit of the class requirement. The main technologies were the two programming languages, with seaborn as a specialized high-level library for visualization.

According to the data, the hotel's bookings come mainly from parties of two guests, followed by single guests and then families. Lead time was the key factor affecting cancellation: the likelihood of a booking being canceled grew as the period between booking and check-in increased.

When we evaluated deposit type, non-refundable bookings showed a significant cancellation rate, which contradicted intuition. This was spread over the entire range of lead times rather than concentrated in a specific timeframe. The Online Travel Agency market segment accounted for the majority of bookings, followed by direct bookings.
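The lead-time finding above comes down to comparing cancellation rates across lead-time ranges. The snippet below is a minimal sketch of that idea using only the Python standard library. The records, the bucket edges, and the function names are all hypothetical, made up for illustration; they are not the project's data, bins, or code, and the toy numbers are constructed to mirror the trend rather than to prove it.

```python
from collections import defaultdict

# Hypothetical toy records: (lead_time_days, is_canceled).
# These are NOT the project's data; they only illustrate the computation.
bookings = [
    (3, 0), (10, 0), (25, 0), (28, 1),
    (40, 0), (60, 1), (75, 1), (90, 0),
    (120, 0), (150, 1), (200, 1), (300, 1),
]

# Illustrative bucket edges in days (inclusive); the bins actually used
# in the project are not stated, so these are an assumption.
BUCKETS = [(0, 30), (31, 90), (91, None)]


def bucket_label(lead_time):
    """Return the label of the lead-time bucket containing lead_time."""
    for low, high in BUCKETS:
        if high is None:
            if lead_time >= low:
                return f"{low}+"
        elif low <= lead_time <= high:
            return f"{low}-{high}"
    raise ValueError(f"lead time out of range: {lead_time}")


def cancellation_rate_by_bucket(records):
    """Map each lead-time bucket to its share of canceled bookings."""
    totals = defaultdict(int)
    canceled = defaultdict(int)
    for lead_time, is_canceled in records:
        label = bucket_label(lead_time)
        totals[label] += 1
        canceled[label] += is_canceled
    return {label: canceled[label] / totals[label] for label in totals}


rates = cancellation_rate_by_bucket(bookings)
for label, rate in rates.items():
    print(f"{label:>6} days: {rate:.0%} canceled")
```

The same grouping could be repeated per deposit type to check whether a pattern holds across all lead-time ranges, which is the kind of comparison described for non-refundable bookings.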

Following our analysis, we focused on a few recommendations for the hotel. The top three were as follows:

1. Reducing the lead time (time between booking and check-in).
2. Increasing the number of type A rooms.
3. Focusing on corporate clients.

Some previews from the final presentation