Last tested: Sep 17, 2020
To schedule content with a datagroup, the datagroup must be defined explicitly in the model file, or written in a separate LookML (.lkml) file that is included in the model file.
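As a minimal sketch of the second option, the fragment below keeps the datagroup in its own file and pulls it into the model with include. The file names, connection name, datagroup name, and the etl_log table are all hypothetical, and the include path depends on how the project's folders are laid out.

```lookml
# --- datagroups.lkml (hypothetical file name) ---
datagroup: daily_refresh {
  # Placeholder trigger: the result changes whenever a new row lands in the
  # hypothetical etl_log table.
  sql_trigger: SELECT MAX(id) FROM etl_log ;;
  max_cache_age: "24 hours"
}

# --- my_model.model.lkml (hypothetical model file) ---
connection: "my_connection"

# Makes daily_refresh available to this model; adjust the path to match
# the project's folder structure.
include: "datagroups.lkml"
```

Defining the datagroup directly in the model file works the same way: the datagroup block simply moves into the model file and the include line is dropped.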
1) Using just sql_trigger: use this to force a cache reset and/or drop and rebuild a PDT at a specific time.
2) Using just max_cache_age: on its own this will not cache a query unless someone manually runs the content, since nothing automates the query-running process (see the Q&A below). Otherwise, use this to create a cache reset policy at the model or explore level that overrides the default 1 hour cache that Looker uses.
3) Using both in combination: the primary use case I have seen is to ensure a scheduled dashboard runs with fresh data. For example, schedule a dashboard to run at 7:01am, and set a datagroup with a sql_trigger for 7:00am and a max_cache_age of 10 hours. The trigger resets the cache at 7:00am, so the 7:01am run queries the database and caches fresh results, and anyone loading that dashboard during business hours before 5:01pm gets instant cached results. If someone loads the dashboard later that night, say at 11:00pm, the newly cached results are invalidated the next morning at 7:00am, so the 7:01am scheduled run is still guaranteed to have fresh data.
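The combined setup in option 3 could be sketched roughly as follows. The datagroup name, the orders explore, and the choice of MySQL syntax for the trigger are assumptions for illustration; the trigger SQL has to be adapted to the database dialect and time zone actually in use.

```lookml
# Hypothetical datagroup for the 7:00am scenario (MySQL syntax assumed).
datagroup: morning_refresh {
  # UNIX_TIMESTAMP counts seconds since the epoch in UTC, so this value
  # increases by 1 once a day at 07:00 UTC. The change in value is what
  # triggers the datagroup.
  sql_trigger: SELECT FLOOR((UNIX_TIMESTAMP(NOW()) - 60*60*7) / (60*60*24)) ;;

  # Results cached after the trigger stay valid for at most 10 hours.
  max_cache_age: "10 hours"
}

# Apply the policy to a single explore (hypothetical explore name).
# Placing persist_with at the top level of the model file instead would
# apply it to every explore in the model.
explore: orders {
  persist_with: morning_refresh
}
```

Dropping the max_cache_age line gives the sql_trigger-only variant from option 1, and dropping the sql_trigger line gives the max_cache_age-only variant from option 2.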
Q: If I set my max_cache_age to 1 hour, does that mean my schedule will run every hour?
A: No. The way max_cache_age seems to work, it is not checked or triggered on its own. Instead, it is assessed at the moment a user tries to use an item, so it will not actively cause anything to happen when the cache 'would be expired'. If you want schedules to run hourly, create a sql_trigger-based datagroup that triggers hourly, and use that datagroup on your dashboard schedule.
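An hourly datagroup along the lines of the answer above might look like the sketch below. The datagroup name is hypothetical and the trigger again assumes MySQL syntax; other dialects need an equivalent expression that returns a new value once per hour.

```lookml
# Hypothetical hourly datagroup (MySQL syntax assumed).
datagroup: hourly_refresh {
  # Seconds since the epoch divided by 3600, rounded down: the value
  # changes once per hour, on the hour.
  sql_trigger: SELECT FLOOR(UNIX_TIMESTAMP() / (60*60)) ;;

  # Keeps cached results from outliving the hour they were generated for.
  max_cache_age: "1 hour"
}
```

The schedule itself is then set to run on this datagroup's update rather than on a time interval; that choice is made in the schedule settings for the dashboard, not in LookML.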