by Professor Petabyte
SQL - Structured Query Language (commonly pronounced 'sequel') is arguably the best tool for analysis of data sets from a few items to billions of item of data. If this is news to you, then the rest of this page is likely to be way over your head for the time being, because only those with a reasonable working knowledge of SQL are likely to find this useful - but Google is your friend, and can take you to millions of websites where you can learn the basics.
Assuming you have the basics, and understand this
SELECT COUNT(id) AS AllCars FROM motor_fleet;please read on, considering this table of data to go with that query. (Here are just the first 3 of 100 rows, the full data set is at the bottom of this page. Consider `id` to be a primary key, unique, auto number, not NULL field.)
id | make | power | year_first_reg |
---|---|---|---|
1 | BMW | Petrol | 2015 |
2 | Chrysler | Petrol | 2016 |
3 | Ford | Petrol | 2016 |
... | ... | ... | ... |
If you understand the SELECT and COUNT statements in SQL, you'll have worked out that this query will count the number of records in the table `motor_fleet`, and return that number labelled AllCars. This is pretty simple.
AllCars |
---|
100 |
One might also want to make selective queries, for example to look at makes, types of power, or how old the cars are:-
SELECT COUNT(*) AS Fords FROM motor_fleet WHERE `make` = 'Ford';
SELECT COUNT(*) AS OldCar FROM motor_fleet WHERE `year_first_reg` < 2018;
SELECT COUNT(*) AS Petrol FROM motor_fleet WHERE `power` = 'Petrol';
Fords |
---|
36 |
OldCar |
---|
46 |
Petrol |
---|
67 |
This is kind of OK, but running the queries one by one, gives separate answers, and some databases like LibreOffice Base for example, only allow one SQL query at a time, and we haven't started on combined queries yet.
SELECT COUNT(*) AS OldCar_Fords FROM motor_fleet
WHERE `make` = 'Ford' AND `year_first_reg` < 2018;
SELECT COUNT(*) AS Petrol_Fords FROM motor_fleet
WHERE `make` = 'Ford' AND `power` = 'Petrol' ;
SELECT COUNT(*) AS OldCar_Petrol FROM motor_fleet
WHERE `year_first_reg` < 2018 AND `power` = 'Petrol' ;
SELECT COUNT(*) AS OldCar_Petrol_Fords FROM motor_fleet
WHERE `year_first_reg` < 2018 AND `make` = 'Ford' AND `power` = 'Petrol';
OldCar_Fords |
---|
18 |
Petrol_Fords |
---|
24 |
OldCar_Petrol |
---|
32 |
OldCar_Petrol_Fords |
---|
12 |
What a mess!! Now we have to run four queries, and then manually collate the results to form a single dataset. There is a better way..........
SELECT COUNT(*) AS AllCars, COUNT(Fords) AS Fords, COUNT(OldCar) AS OldCar, COUNT(Petrol) AS Petrol, COUNT(Fords + OldCar) AS OldCar_Fords, COUNT(Fords + Petrol) AS Petrol_Fords, COUNT(OldCar + Petrol) AS Petrol_OldCar, COUNT(OldCar + Petrol + Fords) AS Petrol_OldCar_Fords FROM ( SELECT CASE WHEN `make` = 'Ford' THEN 1 END AS Fords, CASE WHEN `power` = 'Petrol' THEN 1 END AS Petrol, CASE WHEN `year_first_reg` < 2018 THEN 1 END AS OldCar FROM motor_fleet ) AS Multiple_Combination_Aggregates
This gives the following result...
AllCars | Fords | OldCar | Petrol | OldCar_Fords | Petrol_Fords | Petrol_OldCar | Petrol_OldCar_Fords |
---|---|---|---|---|---|---|---|
100 | 36 | 46 | 67 | 18 | 24 | 32 | 12 |
... and thus we have all of the aggregates in one dataset, from one query, and each one meaningfully titled.
Here is the data used for this example. It is given in ordinary .csv format so you can just copy it straight from the screen into a .csv file or your own database.
Table Name : motor_fleet Data: id,make,power,year_first_reg 1,BMW,Petrol,2015 2,Chrysler,Petrol,2016 3,Ford,Petrol,2016 4,Mercedes,Petrol,2021 5,Ford,Petrol,2021 6,Chrysler,Petrol,2015 7,Chrysler,Petrol,2020 8,Ford,Petrol,2019 9,Mercedes,Petrol,2020 10,Chrysler,Petrol,2017 11,Chrysler,Petrol,2020 12,Chrysler,Petrol,2020 13,Chrysler,Petrol,2015 14,Toyota,Petrol,2020 15,Chrysler,Petrol,2017 16,Ford,Petrol,2018 17,Mercedes,Petrol,2015 18,Chrysler,Petrol,2020 19,Ford,Petrol,2021 20,Toyota,Petrol,2018 21,Chrysler,Petrol,2017 22,Ford,Petrol,2018 23,Chrysler,Petrol,2018 24,BMW,Petrol,2015 25,Ford,Petrol,2017 26,Chrysler,Petrol,2021 27,Chrysler,Petrol,2016 28,Mercedes,Petrol,2021 29,Chrysler,Petrol,2021 30,Toyota,Petrol,2017 31,Mercedes,Petrol,2018 32,Ford,Petrol,2018 33,Ford,Petrol,2016 34,Chrysler,Petrol,2017 35,Ford,Petrol,2019 36,Chrysler,Petrol,2015 37,Ford,Petrol,2016 38,Chrysler,Petrol,2018 39,Ford,Petrol,2021 40,Ford,Petrol,2021 41,Chrysler,Petrol,2015 42,Chrysler,Petrol,2021 43,Ford,Petrol,2018 44,Ford,Petrol,2019 45,BMW,Petrol,2020 46,Chrysler,Petrol,2016 47,Ford,Petrol,2015 48,Ford,Petrol,2015 49,Chrysler,Petrol,2015 50,Ford,Petrol,2019 51,Chrysler,Petrol,2020 52,BMW,Petrol,2015 53,Chrysler,Petrol,2015 54,Ford,Petrol,2016 55,BMW,Petrol,2017 56,Ford,Petrol,2015 57,Chrysler,Petrol,2020 58,Ford,Petrol,2017 59,Ford,Petrol,2015 60,Ford,Petrol,2016 61,Chrysler,Petrol,2018 62,Mercedes,Petrol,2021 63,Mercedes,Petrol,2019 64,Chrysler,Petrol,2016 65,Mercedes,Petrol,2021 66,Mercedes,Petrol,2019 67,Ford,Petrol,2017 68,Mercedes,Diesel,2017 69,Ford,Diesel,2020 70,BMW,Diesel,2018 71,Chrysler,Diesel,2016 72,Chrysler,Diesel,2018 73,Ford,Diesel,2021 74,Chrysler,Diesel,2017 75,Chrysler,Diesel,2015 76,Ford,Diesel,2015 77,Toyota,Electric,2021 78,Ford,Electric,2020 79,Chrysler,Electric,2015 80,Chrysler,Electric,2021 81,BMW,Electric,2017 82,Ford,Electric,2016 83,Mercedes,Electric,2021 84,Ford,Electric,2019 85,Mercedes,Electric,2018 86,Chrysler,Electric,2019 87,Ford,Electric,2016 88,Mercedes,Electric,2017 89,Mercedes,Electric,2021 90,Chrysler,Electric,2021 91,Chrysler,Electric,2019 92,Ford,Electric,2017 93,Ford,Electric,2016 94,BMW,Electric,2016 95,Ford,Electric,2018 96,Mercedes,Electric,2018 97,Ford,Electric,2020 98,BMW,Electric,2019 99,Chrysler,Electric,2020 100,Ford,Electric,2016