Implementing Scheduled Tasks in Python with Celery

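Celery is a third-party Python package. It can also be used with Django, though the relevant settings have to be configured by hand.

Many applications need to run certain tasks on a schedule. On my blog, for example, the day's note statistics are refreshed at 5 a.m. every day, mainly to feed a heat-map visualization. Updating at one fixed time, combined with caching, effectively reduces the load on the server.

Scheduled tasks are also common in other workloads: database maintenance, scheduled crawlers, birthday message pushes, and so on. The rest of this post walks through how to set up scheduled tasks with Celery.

It assumes you have already written your Celery task functions.

A simple scheduled-task setup

Code: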

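# Configured in settings.py: the schedule for my daily job that
# refreshes the published-article counts.
from celery.schedules import crontab

# Celery time zone: reuse TIME_ZONE from settings. It decides whether
# scheduled tasks fire at the intended local time.
# Note: the setting is spelled CELERY_TIMEZONE (not CELERY_TIME_ZONE).
CELERY_TIMEZONE = TIME_ZONE

# Periodic tasks
CELERYBEAT_SCHEDULE = {
    'update_note': {  # fires every day at 00:00 (use hour=5 for a 5 a.m. run)
        'task': 'mainsite.tasks.update_daily_note',
        'schedule': crontab(minute=0, hour=0),
    },
    'add-every-30-seconds': {  # test task that fires every 30 seconds
        'task': 'mainsite.tasks.tests_Periodic',
        'schedule': 30.0,
        'args': (),
    },
}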
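Newer Celery releases prefer lowercase setting names set on the app object. As a minimal sketch of the same schedule in that style, assuming an app named project (matching the -A argument used in this post) and a placeholder time zone string that is not taken from the original settings:

```python
from celery import Celery
from celery.schedules import crontab

# Assumption: the app is called "project", mirroring `-A project` in this post.
app = Celery("project")

# Assumption: placeholder time zone; in Django you would reuse settings.TIME_ZONE.
app.conf.timezone = "Asia/Shanghai"

# Same two entries as the settings.py version, using the lowercase setting name.
app.conf.beat_schedule = {
    "update_note": {
        "task": "mainsite.tasks.update_daily_note",
        "schedule": crontab(minute=0, hour=0),  # every day at 00:00
    },
    "add-every-30-seconds": {
        "task": "mainsite.tasks.tests_Periodic",
        "schedule": 30.0,  # every 30 seconds
        "args": (),
    },
}
```

If the Django settings are instead loaded with app.config_from_object('django.conf:settings', namespace='CELERY'), the corresponding keys in settings.py would be CELERY_TIMEZONE and CELERY_BEAT_SCHEDULE.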

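Note: defining the schedule is not enough on its own. A separate scheduler process, celery beat, has to keep running so that each task is sent to the queue when it comes due, and a worker (celery -A project worker -l info) has to be running to execute it.

Start command: celery -A project beat -l info (Celery 5 requires the -A option before the subcommand; this form also works on 4.x)

On a successful start, output like the following appears: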

celery beat v4.4.2 (cliffs) is starting.
__ - ... __ - _
LocalTime -> 2020-04-25 21:16:39
Configuration ->
. broker -> redis://:**@127.0.0.1:6379/0
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%INFO
. maxinterval -> 5.00 minutes (300s)

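Despite the name, beat is not a network keep-alive. A keep-alive heartbeat works by having the client tell the server at regular intervals "I'm still alive, don't disconnect me": the server typically runs a timer and a low-priority thread that keeps checking whether the client is still connected, and the client's small periodic packets keep the connection open.

Celery beat is a scheduler instead. It wakes up at intervals (at most every maxinterval, 5 minutes in the output above), checks which schedule entries are due, and sends the corresponding task messages to the broker, where a worker picks them up and runs them.

Note: once beat is running, each time a scheduled task fires, beat logs the dispatch, and any output the task produces shows up in the Celery worker log.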
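To make beat's role concrete, here is a toy simulation in plain Python. It is not Celery's actual implementation: it only shows a scheduler deciding when each entry is due and recording a message for it, leaving execution to whoever consumes the queue. The function simulate_beat, the second schedule entry, and the simulated clock are all assumptions made for illustration.

```python
import heapq


def simulate_beat(schedule, until):
    """Return the (time, task_name) messages a toy beat would enqueue up to `until` seconds.

    `schedule` maps an entry name to (task_name, interval_seconds); intervals must be > 0.
    """
    # Heap entries: (next_due_time, entry_name, interval_seconds, task_name).
    heap = [(interval, name, interval, task)
            for name, (task, interval) in schedule.items()]
    heapq.heapify(heap)

    sent = []  # stands in for the broker queue that workers would consume
    while heap and heap[0][0] <= until:
        due, name, interval, task = heapq.heappop(heap)
        sent.append((due, task))  # the scheduler only enqueues; it never runs the task
        heapq.heappush(heap, (due + interval, name, interval, task))
    return sent


schedule = {
    "add-every-30-seconds": ("mainsite.tasks.tests_Periodic", 30.0),
    # Hypothetical second entry, added only to show interleaving.
    "every-minute": ("mainsite.tasks.hypothetical_minutely", 60.0),
}

messages = simulate_beat(schedule, until=90.0)
for due, task in messages:
    print(f"t={due:5.1f}s  send {task}")
```

The real beat process works against wall-clock time and a message broker such as the Redis instance shown in the startup output, but the division of labour is the same: beat enqueues, workers execute.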

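Forms of crontab schedules

from celery.schedules import crontab

Use crontab to define scheduled tasks that fire at different times.

The following examples are taken from the official documentation: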

crontab()  # Execute every minute.
crontab(minute=0, hour=0)  # Execute daily at midnight.
crontab(minute=0, hour='*/3')  # Execute every three hours: midnight, 3am, 6am, 9am, noon, 3pm, 6pm, 9pm.
crontab(minute=0, hour='0,3,6,9,12,15,18,21')  # Same as previous.
crontab(minute='*/15')  # Execute every 15 minutes.
crontab(day_of_week='sunday')  # Execute every minute (!) on Sundays.
crontab(minute='*', hour='*', day_of_week='sun')  # Same as previous.
crontab(minute='*/10', hour='3,17,22', day_of_week='thu,fri')  # Execute every ten minutes, but only between 3-4 am, 5-6 pm, and 10-11 pm on Thursdays or Fridays.
crontab(minute=0, hour='*/2,*/3')  # Execute every even hour, and every hour divisible by three. This means: at every hour except 1am, 5am, 7am, 11am, 1pm, 5pm, 7pm, 11pm.
crontab(minute=0, hour='*/5')  # Execute every hour divisible by 5. This means it is triggered at 3pm, not 5pm (since 3pm equals the 24-hour clock value of "15", which is divisible by 5).
crontab(minute=0, hour='*/3,8-17')  # Execute every hour divisible by 3, and every hour during office hours (8am-5pm).
crontab(0, 0, day_of_month='2')  # Execute on the second day of every month.
crontab(0, 0, day_of_month='2-30/2')  # Execute on every even-numbered day.
crontab(0, 0, day_of_month='1-7,15-21')  # Execute on the first and third weeks of the month.
crontab(0, 0, day_of_month='11', month_of_year='5')  # Execute on the eleventh of May every year.
crontab(0, 0, month_of_year='*/3')  # Execute every day in the first month of every quarter.
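The less obvious cases in this list (such as '*/5' firing at 3pm but not 5pm) are easier to check by expanding a field into the concrete values it matches. The helper below, expand_field, is a hypothetical, simplified stand-in written for this post. It is not Celery's parser and handles only numeric fields, not names like 'sun' or 'thu':

```python
def expand_field(expr, lo, hi):
    """Expand a numeric cron-style field into the sorted list of values it matches.

    Supports '*', single numbers, 'a-b' ranges, an optional '/step' suffix,
    and comma-separated combinations of these. `lo`/`hi` bound the '*' range.
    """
    values = set()
    for part in str(expr).split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            first, last = rng.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return sorted(values)


# Hours run 0-23, days of the month 1-31, months 1-12.
every_five_hours = expand_field("*/5", 0, 23)
even_or_third_hours = expand_field("*/2,*/3", 0, 23)
skipped_hours = sorted(set(range(24)) - set(even_or_third_hours))
even_days = expand_field("2-30/2", 1, 31)
quarter_months = expand_field("*/3", 1, 12)

print("hour='*/5'      ->", every_five_hours)
print("hours skipped by '*/2,*/3' ->", skipped_hours)
print("month_of_year='*/3' ->", quarter_months)
```

Under these simplified rules, hour='*/5' includes 15 but not 17, and the hours skipped by '*/2,*/3' are exactly the ones named in the comment above.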

For other details, see the official documentation:

https://docs.celeryproject.org/en/stable