重弹Transact-SQL的随机数

Transact-SQL的随机数主要有:

RAND:返回一个介于 0 到 1(不包括 0 和 1)之间的伪随机 float 值。如果未指定 seed,则 SQL Server 数据库引擎随机分配种子值。 使用同一个种子值重复调用 RAND() 会返回相同的结果。对于一个连接,如果使用指定的种子值调用 RAND(),则 RAND() 的所有后续调用将基于使用该指定种子值的 RAND() 调用生成结果。

NEWID:创建 uniqueidentifier 类型的唯一值。NEWID 对每台计算机返回的值各不相同。

CHECKSUM:返回按照表的某一行或一组表达式计算出来的校验和值,用于生成哈希索引。CHECKSUM 值取决于排序规则。使用不同排序规则存储的相同值将返回一个不同的 CHECKSUM 值。(竟然可以把字符串转为数字!!!)

因为RAND()使用同一个种子值重复调用 RAND() 会返回相同的结果,所以会造成了一个后果,SELECT到所有列全部是同一个值,如果这个值需要同其它值进行计算,那就无法真正的随机了。

此时就需要CHECKSUM(NEWID())甚至RAND(CHECKSUM(NEWID()))出场了。

具体怎么用,多试就知道了。

[抛砖引玉]换个思路解决SQL经典问题(二):时间区间-按年分月统计

声明:本篇文章的SQL语句为了体现作者的思路,并非最优,请根据实际需要进行优化。

曾经帮别人解决一个这样的问题:

e_class_table

CREATE TABLE [dbo].[E_Classes] ([ClassID] varchar(12) NOT NULL ,[BeginDate] date NULL ,[EndDate] date NULL ,[StudentCount] smallint NULL);
INSERT INTO [dbo].[E_Classes] ([ClassID], [BeginDate], [EndDate], [StudentCount]) VALUES (N'01', N'2011-10-07', N'2012-03-11', N'49');
INSERT INTO [dbo].[E_Classes] ([ClassID], [BeginDate], [EndDate], [StudentCount]) VALUES (N'02', N'2012-04-05', N'2012-06-20', N'23');
INSERT INTO [dbo].[E_Classes] ([ClassID], [BeginDate], [EndDate], [StudentCount]) VALUES (N'03', N'2012-05-01', N'2012-07-03', N'46');
INSERT INTO [dbo].[E_Classes] ([ClassID], [BeginDate], [EndDate], [StudentCount]) VALUES (N'04', N'2013-03-08', N'2013-06-12', N'42');
INSERT INTO [dbo].[E_Classes] ([ClassID], [BeginDate], [EndDate], [StudentCount]) VALUES (N'05', N'2013-09-19', N'2013-10-26', N'78');
INSERT INTO [dbo].[E_Classes] ([ClassID], [BeginDate], [EndDate], [StudentCount]) VALUES (N'06', N'2013-11-21', N'2014-01-09', N'44');
INSERT INTO [dbo].[E_Classes] ([ClassID], [BeginDate], [EndDate], [StudentCount]) VALUES (N'07', N'2013-12-27', N'2014-02-06', N'36');
INSERT INTO [dbo].[E_Classes] ([ClassID], [BeginDate], [EndDate], [StudentCount]) VALUES (N'08', N'2014-02-28', N'2014-06-20', N'52');

要求按年和月分别统计全部课程的统计每个月的人天次,即:

class_total

计算方法为 (每门课在当月的有效天数*人数)的总和,如2014年01月的应该为 课程06的 2014-01-01~2014-01-09共计9天*44人=396人天 再加上 课程07的2014-01-01~2014-01-31共计31天*36人=1116人天,那么2014年01月的人天为 396+1116=1512人天次。

这个问题看上去挺麻烦的,但是我们可以换个思路去考虑,先建一张序号表:

CREATE TABLE [dbo].[E_Sequence]([Sequence_ID] [int] IDENTITY(1,1) NOT NULL, CONSTRAINT [PK_E_Sequence] PRIMARY KEY CLUSTERED ([Sequence_ID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]);
SET IDENTITY_INSERT [dbo].[E_Sequence] ON
INSERT INTO [dbo].[E_Sequence] (Sequence_ID)
SELECT number FROM [master]..[spt_values] WHERE  type = 'P'
UNION ALL
SELECT number + 2048 FROM [master]..[spt_values] WHERE  type = 'P'
SET IDENTITY_INSERT [dbo].[E_Sequence] OFF

那。。。这张序号表是干什么用的呢?看看下面就知道了:

SELECT Dateadd(day, Sequence_id, '2011-1-1') AS Class_Date FROM E_Sequence

通过这个语句,我们可以生成一个从2011年01月01日开始的4000天,我们将用到这个查询来将课程表的时间段,拆分到每一天去:

SELECT Year(Class_Date)  AS Class_Year,
       Month(Class_Date) AS Class_Month,
       StudentCount
FROM   (SELECT Dateadd(day, Sequence_id, '2011-1-1') AS Class_Date
        FROM   E_Sequence) t1,
       E_Classes t2
WHERE  t1.Class_Date BETWEEN t2.BeginDate AND t2.EndDate

看看效果:

class_split

OK,这样,是不是就很明了了呢?“年”字段有了,“月”字段有了,“学生数”是不是就可以简单地加了?然后再一个行列转换就可以实现了(SQL2000请自行转为Case When End):

SELECT Class_Year      AS ??,
       Isnull([1], 0)  AS ??,
       Isnull([2], 0)  AS ??,
       Isnull([3], 0)  AS ??,
       Isnull([4], 0)  AS ??,
       Isnull([5], 0)  AS ??,
       Isnull([6], 0)  AS ??,
       Isnull([7], 0)  AS ??,
       Isnull([8], 0)  AS ??,
       Isnull([9], 0)  AS ??,
       Isnull([10], 0) AS ??,
       Isnull([11], 0) AS ???,
       Isnull([12], 0) AS ???
FROM   (SELECT Year(Class_Date)  AS Class_Year,
               Month(Class_Date) AS Class_Month,
               studentcount
        FROM   (SELECT Dateadd(day, Sequence_id, '2011-1-1') AS Class_Date
                FROM   E_Sequence) t1,
               e_classes t2
        WHERE  t1.Class_Date BETWEEN t2.begindate AND t2.enddate) t 
        PIVOT (Sum(studentcount) FOR Class_Month IN
        ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12])) tpivot

看到了吧?有时,引入一张外表,可以让我们的思路更清楚。

如果将

SELECT Dateadd(day, Sequence_id, '2011-1-1') AS Class_Date FROM E_Sequence

这句生成的查询,直接做成一张日期的序列表再加上索引的话,那在性能上会快更多,具体优化就不多讲了。

继续留个尾巴:

请只统计2012年07月16日到2013年06月15日之间,排除双休日的每个月的人天次。

[抛砖引玉]换个思路解决SQL经典问题(一):按每半小时统计

声明:本篇文章的SQL语句为了体现作者的思路,并非最优,请根据实际需要进行优化。

在开发的过程中,我们常常会遇到类似这种按每半小时统计数量的需求:

CREATE TABLE [dbo].[PerHalfHour]([CallInTime] [smalldatetime] NULL, [Result] [int] NULL);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 08:20:00', 2);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 08:38:00', 2);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 08:43:00', 2);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 08:49:00', 2);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 09:07:00', 2);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 09:07:00', 1);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 09:10:00', 1);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 09:16:00', 1);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 09:41:00', 1);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 09:44:00', 1);
INSERT [dbo].[PerHalfHour] ([CallInTime], [Result]) VALUES ('2014-01-01 09:52:00', 1);

只统计其中的一部分并不难:

SELECT CallInTime_Hour,
       CallInTime_Minute,
       Count(*) AS CallInTime_Count
FROM   (SELECT Datepart(HOUR, CallInTime)             AS CallInTime_Hour,
               Datepart(MINUTE, CallInTime) / 30 * 30 AS CallInTime_Minute
			   /*??SQL ??/??=?? ??????????*/
        FROM   [PerHalfHour]
        WHERE  CallInTime >= '2014-01-01'
               AND CallInTime < '2014-01-02') t
GROUP  BY CallInTime_Hour,
          CallInTime_Minute
ORDER  BY CallInTime_Hour,
          CallInTime_Minute

统计结果为:


但是实际需求可能要求显示全部24小时的统计,这时我们只要加一张外联表就可以了:

CREATE TABLE [dbo].[Sequence_HalfHour]([sequence_time] [nvarchar](20) NULL,[sequence_hour] [int] NULL,[sequence_minute] [int] NOT NULL);
INSERT [dbo].[Sequence_HalfHour] ([sequence_time], [sequence_hour], [sequence_minute]) VALUES (N'00:00:00~00:29:59', 0, 0);
INSERT [dbo].[Sequence_HalfHour] ([sequence_time], [sequence_hour], [sequence_minute]) VALUES (N'00:30:00~00:59:59', 0, 30);
INSERT [dbo].[Sequence_HalfHour] ([sequence_time], [sequence_hour], [sequence_minute]) VALUES (N'01:00:00~01:29:59', 1, 0);
INSERT [dbo].[Sequence_HalfHour] ([sequence_time], [sequence_hour], [sequence_minute]) VALUES (N'01:30:00~01:59:59', 1, 30);
...
INSERT [dbo].[Sequence_HalfHour] ([sequence_time], [sequence_hour], [sequence_minute]) VALUES (N'23:00:00~23:29:59', 23, 0);
INSERT [dbo].[Sequence_HalfHour] ([sequence_time], [sequence_hour], [sequence_minute]) VALUES (N'23:30:00~23:59:59', 23, 30);

然后用这个外联表去Left Join上一个查询就OK:

SELECT t0.sequence_time            AS ???,
       Isnull(CallInTime_Count, 0) AS ??
FROM   sequence_halfhour t0
       LEFT JOIN (SELECT CallInTime_Hour,
                         CallInTime_Minute,
                         Count(*) AS CallInTime_Count
                  FROM   (SELECT Datepart(HOUR, CallInTime)             AS CallInTime_Hour,
                                 Datepart(MINUTE, CallInTime) / 30 * 30 AS CallInTime_Minute
                          /*??SQL ??/??=?? ??????????*/
                          FROM   [PerHalfHour]
                          WHERE  CallInTime >= '2014-01-01'
                                 AND CallInTime < '2014-01-02') t
                  GROUP  BY CallInTime_Hour,
                            CallInTime_Minute) t1
         ON t0.sequence_hour = t1.CallInTime_Hour
            AND t0.sequence_minute = t1.CallInTime_Minute

查看结果:

留个尾巴:

第一张表里有个Result字段,请用SQL语句输出以下表格:

callintime_result