MySQL - How do I count the number of data in a column for each month using a start and end date
I have a table with similar information, and I want to extract data when the user selects a start date of '2015-01-22' and an end date of '2015-07-31'. The result should look like this:
Month: Total quantity
January: 8
February: 6
March: 0
April: 0
May: 2
June: 18
July: 6
Here's a sample query and fiddle:
CREATE TABLE orders (
  id INT PRIMARY KEY AUTO_INCREMENT,
  order_date DATE,
  product_id INT,
  quantity INT,
  customer_id INT
);

INSERT INTO orders (order_date, product_id, quantity, customer_id) VALUES
  ('2015-01-01', 1, 2, 123),
  ('2015-01-06', 3, 6, 123),
  ('2015-02-14', 2, 4, 123),
  ('2015-02-15', 2, 2, 123),
  ('2015-05-16', 1, 1, 456),
  ('2015-05-17', 1, 1, 456),
  ('2015-06-18', 1, 5, 789),
  ('2015-06-18', 3, 7, 123),
  ('2015-06-10', 3, 6, 123),
  ('2015-07-13', 1, 5, 456),
  ('2015-07-14', 1, 1, 456);
http://sqlfiddle.com/#!2/01ac19/1
The results should be the total quantity of orders per month.
First, you want what's known as a "calendar table". These are, hands down, the most useful analysis table you can make. The individual definition, and the data it's filled with, varies and won't be covered here, but for our purposes we'll use the following minimum definition:
CREATE TABLE calendar (
  calendardate DATE PRIMARY KEY,
  year INTEGER,
  month INTEGER,
  dayofmonth INTEGER
);
... and it's filled with the data you'd expect (insert every single date from when your business started to some reasonable point in the future). You want indices on it - lots of indices.
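As a rough sketch of how such a table might be filled, the statement below generates one row per day with a recursive CTE. It assumes MySQL 8.0 or later (older versions have no recursive CTEs and would need a numbers table or a loop instead), and the 2015 date range is only an illustration chosen to cover the sample data.

```sql
-- Assumption: MySQL 8.0+ and the calendar table defined above.
-- The date range is illustrative; pick whatever covers your own data.
INSERT INTO calendar (calendardate, year, month, dayofmonth)
WITH RECURSIVE dates (d) AS (
    SELECT DATE '2015-01-01'              -- first date to generate
    UNION ALL
    SELECT d + INTERVAL 1 DAY             -- step one day at a time
    FROM dates
    WHERE d < DATE '2015-12-31'           -- last date to generate
)
SELECT d, YEAR(d), MONTH(d), DAYOFMONTH(d)
FROM dates;
```

MySQL's cte_max_recursion_depth defaults to 1000, so a range longer than roughly 1000 days would need that session variable raised first.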
Next, we need to consider something important about databases: they generally can't use indices if a function's output is used as the criterion. Basically, if it's not in the SELECT clause, using a function (even via implicit casts) makes the query slower. So, doing things like YEAR(order_date) should be avoided.
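To make the contrast concrete, here is a small sketch against the sample orders table. The index name idx_orders_order_date is hypothetical (the sample schema defines no index on order_date), and whether the optimizer actually picks the index depends on the data and the MySQL version.

```sql
-- Hypothetical index; the sample schema does not define one.
CREATE INDEX idx_orders_order_date ON orders (order_date);

-- Wraps the column in functions, so a plain index on order_date
-- typically cannot be used to narrow the search.
SELECT SUM(quantity) AS june_total
FROM orders
WHERE YEAR(order_date) = 2015
  AND MONTH(order_date) = 6;

-- Same month expressed as a half-open range on the bare column,
-- which an index on order_date can serve.
SELECT SUM(quantity) AS june_total
FROM orders
WHERE order_date >= '2015-06-01'
  AND order_date <  '2015-07-01';
```

Both statements return 18 for the sample data; only the second leaves the indexed column untouched.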
So how do we aggregate things by year or month? Via range queries. If the database has an index, it's pretty cheap to get the start and end of a range (and nicely parallelizable, too). In our case, the range is >= startofmonth and < startofnextmonth. We can build an in-process range table:
SELECT year, month,
       calendardate AS monthstart,
       calendardate + INTERVAL 1 MONTH AS nextmonthstart
FROM calendar
WHERE dayofmonth = 1
  AND calendardate >= :querystartrange
  AND calendardate < :queryendrange
... with the : parameters denoting start-of-month values (deriving them is left as an exercise for the reader).
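For what it's worth, one possible way to derive such start-of-month bounds from arbitrary user dates in MySQL is sketched below. The variable names @user_start and @user_end are made up for the illustration, and the dates are the ones from the question. Functions applied to constants like these are harmless; it is functions on indexed columns that get in the way.

```sql
-- Hypothetical session variables holding the user's chosen dates.
SET @user_start = '2015-01-22';
SET @user_end   = '2015-07-31';

SELECT
    -- first day of the month containing the start date
    DATE_SUB(@user_start, INTERVAL (DAYOFMONTH(@user_start) - 1) DAY) AS querystartrange,
    -- first day of the month after the one containing the end date
    DATE_ADD(LAST_DAY(@user_end), INTERVAL 1 DAY) AS queryendrange;
```

For the question's dates this yields 2015-01-01 and 2015-08-01, so January through July are each covered as whole months.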
Now, remember how I said "no functions"? calendardate + INTERVAL 1 MONTH counts. However, it's not going to matter here; the resulting table is so small (just 12 rows per year!) that most RDBMSs can place its contents in memory for faster results (because it would take longer to hit the index).
Now that we have our range-query table, we can join it to the orders ("fact") table:
SELECT drange.year, drange.month, SUM(orders.quantity) AS total_quantity
FROM (SELECT year, month,
             calendardate AS monthstart,
             calendardate + INTERVAL 1 MONTH AS nextmonthstart
      FROM calendar
      WHERE dayofmonth = 1
        AND calendardate >= :querystartrange
        AND calendardate < :queryendrange) drange
JOIN orders
  ON orders.order_date >= drange.monthstart
 AND orders.order_date < drange.nextmonthstart
GROUP BY drange.year, drange.month
ORDER BY drange.year, drange.month
example fiddle
(Fun trick: using LEFT JOIN instead of JOIN will net you null-quantity rows if a month has no orders - March and April in the example data.)
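A sketch of that LEFT JOIN variant follows, with COALESCE added so that empty months show 0 rather than NULL, as in the output the question asks for. It assumes the calendar table has been filled for 2015, and the literal bounds '2015-01-01' and '2015-08-01' stand in for the start-of-month parameters.

```sql
SELECT drange.year, drange.month,
       -- SUM over no matching rows is NULL; turn that into 0
       COALESCE(SUM(orders.quantity), 0) AS total_quantity
FROM (SELECT year, month,
             calendardate AS monthstart,
             calendardate + INTERVAL 1 MONTH AS nextmonthstart
      FROM calendar
      WHERE dayofmonth = 1
        AND calendardate >= '2015-01-01'
        AND calendardate <  '2015-08-01') AS drange
LEFT JOIN orders
       ON orders.order_date >= drange.monthstart
      AND orders.order_date <  drange.nextmonthstart
GROUP BY drange.year, drange.month
ORDER BY drange.year, drange.month;
```

With the sample data this gives 8, 6, 0, 0, 2, 18 and 6 for months 1 through 7 of 2015.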
So what does this get us? Range-query access to the base data, which should make for a faster query. And if, for some reason, order_date gets turned into a timestamp, the query is still safe - we'll correctly get all orders, and put them in the proper months.