# Austrian Hotels Dataset

## Info
This dataset contains realistic data on hotels across Austria.
- This dataset was generated programmatically with the `generate_austrian_hotels_data.R` script to ensure realistic relationships between variables while maintaining privacy.
- The script was written by Claude AI, Sonnet 3.7, on 2025-03-15, and reviewed and approved by Gabor on 2025-03-17.
- The dataset consists of multiple related tables that can be combined.
- The data patterns are based on typical hotel industry metrics but do not represent actual hotels.

## Dataset Overview

The dataset includes hotels across Austrian cities with data on occupancy, pricing, tourism statistics, and economic indicators.

## Files

All files are located in the `data/raw/` directory:

| File | Description | Rows | Key Columns |
|---|---|---|---|
| `hotels.csv` | Basic hotel information | 200 | `hotel_id` (PK) |
| `cities.csv` | City information | 10 | `city` (PK) |
| `monthly_occupancy.csv` | Monthly hotel performance metrics | ~3,800 | `hotel_id`, `month`, `year` |
| `city_tourism.csv` | Monthly tourism statistics by city | 240 | `city`, `month`, `year` |
| `economic_indicators.csv` | Monthly economic indicators | 24 | `month`, `year` |
| `reviews.csv` | Hotel guest reviews | ~1,700 | `review_id` (PK), `hotel_id` (FK) |
| `amenities.csv` | List of possible hotel amenities | 10 | `amenity_id` (PK) |
| `hotel_amenities.csv` | Hotel-amenity relationships | ~1,000 | `hotel_id`, `amenity_id` |
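
As a minimal sketch of how the tables can be combined, the Python snippet below joins hotel rows to city rows on the shared `city` column, using only the standard library. The sample rows (hotel names, populations, ranks) are made up for illustration, and the helper names `read_rows` and `join_hotels_cities` are hypothetical; in practice the text would be read from the files in `data/raw/`.

```python
import csv
import io

# Hypothetical sample rows shaped like hotels.csv and cities.csv.
# The values are invented for illustration, not taken from the dataset.
HOTELS_CSV = """hotel_id,hotel_name,city,star_rating,rooms,year_built
1,Hotel Alpha,Vienna,4,120,1998
2,Hotel Beta,Salzburg,5,80,2005
"""

CITIES_CSV = """city,province,population,tourism_rank
Vienna,Vienna,1900000,1
Salzburg,Salzburg,155000,2
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row (values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def join_hotels_cities(hotels, cities):
    """Left-join hotels to cities on the shared `city` column."""
    by_city = {row["city"]: row for row in cities}
    joined = []
    for hotel in hotels:
        # Hotels whose city is missing from cities.csv keep only their own columns.
        city_info = by_city.get(hotel["city"], {})
        joined.append({**hotel, **city_info})
    return joined


hotels = read_rows(HOTELS_CSV)
cities = read_rows(CITIES_CSV)
hotels_with_city = join_hotels_cities(hotels, cities)
```

Note that `csv.DictReader` returns every value as a string, so numeric columns such as `rooms` or `population` would need explicit conversion before any arithmetic.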
## Schema Details
### hotels.csv

Information about individual hotels.

| Column | Type | Description |
|---|---|---|
| `hotel_id` | integer | Primary key |
| `hotel_name` | character | Hotel name |
| `city` | character | City where hotel is located |
| `star_rating` | integer | Hotel quality rating (3-5 stars) |
| `rooms` | integer | Number of rooms in the hotel |
| `year_built` | integer | Year the hotel was built |

### cities.csv

Information about Austrian cities.

| Column | Type | Description |
|---|---|---|
| `city` | character | City name (primary key) |
| `province` | character | Austrian province |
| `population` | integer | City population |
| `tourism_rank` | integer | Tourism popularity rank (1 = highest) |

### monthly_occupancy.csv

Monthly hotel performance metrics.

| Column | Type | Description |
|---|---|---|
| `hotel_id` | integer | Foreign key to hotels.csv |
| `month` | integer | Month (1-12) |
| `year` | integer | Year (2023-2024) |
| `occupancy_rate` | numeric | Share of rooms occupied, as a fraction (0.0-1.0) |
| `avg_daily_rate` | numeric | Average price per night in EUR |
| `revenue_per_room` | numeric | Revenue per available room (RevPAR) |
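
RevPAR is conventionally defined as occupancy rate multiplied by average daily rate. Assuming the generated data follows that definition (this README does not state it), a quick consistency check could look like the sketch below. The rows, the tolerance, and the function names `expected_revpar` and `revpar_mismatches` are illustrative assumptions.

```python
import math

# Hypothetical rows shaped like monthly_occupancy.csv (values invented).
rows = [
    {"hotel_id": 1, "month": 1, "year": 2023,
     "occupancy_rate": 0.62, "avg_daily_rate": 150.0, "revenue_per_room": 93.0},
    {"hotel_id": 2, "month": 1, "year": 2023,
     "occupancy_rate": 0.80, "avg_daily_rate": 210.0, "revenue_per_room": 168.0},
]


def expected_revpar(row):
    """Conventional RevPAR: occupancy rate (as a fraction) times average daily rate."""
    return row["occupancy_rate"] * row["avg_daily_rate"]


def revpar_mismatches(rows, rel_tol=0.01):
    """Return rows whose stored RevPAR differs from the conventional formula
    by more than rel_tol (relative tolerance; 1% is an arbitrary choice)."""
    return [
        r for r in rows
        if not math.isclose(r["revenue_per_room"], expected_revpar(r), rel_tol=rel_tol)
    ]


mismatches = revpar_mismatches(rows)
```

If the real data reports mismatches, that would suggest the generator adds noise or uses a different formula, rather than indicating an error in the data.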

### city_tourism.csv

Monthly tourism statistics for each city.

| Column | Type | Description |
|---|---|---|
| `city` | character | City name |
| `month` | integer | Month (1-12) |
| `year` | integer | Year (2023-2024) |
| `tourist_arrivals` | integer | Number of tourists arriving |
| `event_days` | integer | Number of event days in the month |
| `avg_stay_length` | numeric | Average length of stay in days |
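
Because this table is keyed by `city`, `month`, and `year`, linking it to hotel-level occupancy rows means first looking up each hotel's city and then matching on the composite key. The following sketch shows one way to do that with plain dictionaries; all sample values and the helper name `attach_tourism` are hypothetical.

```python
# Hypothetical rows shaped like hotels.csv, monthly_occupancy.csv,
# and city_tourism.csv (only the columns needed here; values invented).
hotels = [
    {"hotel_id": 1, "city": "Vienna"},
    {"hotel_id": 2, "city": "Graz"},
]
occupancy = [
    {"hotel_id": 1, "month": 7, "year": 2023, "occupancy_rate": 0.85},
    {"hotel_id": 2, "month": 7, "year": 2023, "occupancy_rate": 0.70},
]
tourism = [
    {"city": "Vienna", "month": 7, "year": 2023,
     "tourist_arrivals": 500000, "event_days": 6, "avg_stay_length": 2.4},
]

# Lookup tables: hotel_id -> city, and (city, month, year) -> tourism row.
city_of = {h["hotel_id"]: h["city"] for h in hotels}
tourism_by_key = {(t["city"], t["month"], t["year"]): t for t in tourism}


def attach_tourism(occ_row):
    """Add the hotel's city and that city's tourist arrivals for the same
    month and year; arrivals are None when no tourism row matches."""
    city = city_of[occ_row["hotel_id"]]
    match = tourism_by_key.get((city, occ_row["month"], occ_row["year"]))
    arrivals = match["tourist_arrivals"] if match else None
    return {**occ_row, "city": city, "tourist_arrivals": arrivals}


combined = [attach_tourism(row) for row in occupancy]
```

Keeping unmatched rows with `None` mirrors a left join, so a missing city-month is visible instead of silently dropping the hotel's record.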

### economic_indicators.csv

Monthly economic indicators for Austria.

| Column | Type | Description |
|---|---|---|
| `month` | integer | Month (1-12) |
| `year` | integer | Year (2023-2024) |
| `inflation_rate` | numeric | Monthly inflation rate (decimal) |
| `unemployment` | numeric | Unemployment rate (decimal) |
| `consumer_confidence` | numeric | Consumer confidence index |

### reviews.csv

Hotel guest reviews.

| Column | Type | Description |
|---|---|---|
| `review_id` | integer | Primary key |
| `hotel_id` | integer | Foreign key to hotels.csv |
| `rating` | numeric | Rating (1.0-5.0) |
| `review_date` | date | Date of the review |
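
Since each hotel can have many reviews, a typical first step is to aggregate ratings per `hotel_id` before joining them to other tables. The sketch below computes a mean rating per hotel; the sample reviews, the ISO date strings, and the function name `mean_rating_by_hotel` are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows shaped like reviews.csv (values invented; the date
# format used in the real file is an assumption).
reviews = [
    {"review_id": 1, "hotel_id": 1, "rating": 4.5, "review_date": "2023-05-02"},
    {"review_id": 2, "hotel_id": 1, "rating": 3.5, "review_date": "2024-01-15"},
    {"review_id": 3, "hotel_id": 2, "rating": 5.0, "review_date": "2023-08-09"},
]


def mean_rating_by_hotel(reviews):
    """Group ratings by hotel_id and return {hotel_id: mean rating}."""
    ratings = defaultdict(list)
    for review in reviews:
        ratings[review["hotel_id"]].append(review["rating"])
    return {hotel_id: mean(values) for hotel_id, values in ratings.items()}


avg_ratings = mean_rating_by_hotel(reviews)
```

Hotels without any reviews simply do not appear in the result, so a later join to `hotels.csv` should account for missing keys.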

### amenities.csv

List of possible hotel amenities.

| Column | Type | Description |
|---|---|---|
| `amenity_id` | integer | Primary key |
| `amenity_name` | character | Name of the amenity |

### hotel_amenities.csv

Many-to-many relationship between hotels and amenities.

| Column | Type | Description |
|---|---|---|
| `hotel_id` | integer | Foreign key to hotels.csv |
| `amenity_id` | integer | Foreign key to amenities.csv |
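
To illustrate how this link table resolves the many-to-many relationship, the sketch below turns the `hotel_id` and `amenity_id` pairs into a list of amenity names per hotel. The amenity names, the sample links, and the function name `amenities_by_hotel` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows shaped like amenities.csv and hotel_amenities.csv
# (amenity names and links invented for illustration).
amenities = [
    {"amenity_id": 1, "amenity_name": "Pool"},
    {"amenity_id": 2, "amenity_name": "Spa"},
]
hotel_amenities = [
    {"hotel_id": 1, "amenity_id": 1},
    {"hotel_id": 1, "amenity_id": 2},
    {"hotel_id": 2, "amenity_id": 2},
]


def amenities_by_hotel(links, amenities):
    """Resolve link rows into {hotel_id: [amenity_name, ...]},
    preserving the order in which links appear."""
    name_of = {a["amenity_id"]: a["amenity_name"] for a in amenities}
    result = defaultdict(list)
    for link in links:
        result[link["hotel_id"]].append(name_of[link["amenity_id"]])
    return dict(result)


hotel_to_amenities = amenities_by_hotel(hotel_amenities, amenities)
```

The same link rows can be grouped by `amenity_id` instead to list the hotels offering a given amenity.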