Q151. The data engineering team is using a bunch of SQL queries to review data quality and monitor the ETL job every day, which of the following approaches can be used to set up a schedule and auto-mate this process?
Explanation
Explanation
Individual queries can be refreshed on a schedule basis,
To set the schedule:
1. Click the query info tab.
Graphical user interface, text, application, email Description automatically generated
* Click the link to the right of Refresh Schedule to open a picker with schedule intervals.
Graphical user interface, application Description automatically generated
* Set the schedule.
The picker scrolls and allows you to choose:
* An interval: 1-30 minutes, 1-12 hours, 1 or 30 days, 1 or 2 weeks
* A time. The time selector displays in the picker only when the interval is greater than 1 day and the day selection is greater than 1 week. When you schedule a specific time, Databricks SQL takes input in your computer’s timezone and converts it to UTC. If you want a query to run at a certain time in UTC, you must adjust the picker by your local offset. For example, if you want a query to execute at 00:00 UTC each day, but your current timezone is PDT (UTC-7), you should select 17:00 in the picker:
Graphical user interface Description automatically generated
* Click OK.
Your query will run automatically.
If you experience a scheduled query not executing according to its schedule, you should manually trigger the query to make sure it doesn’t fail. However, you should be aware of the following:
* If you schedule an interval-for example, “every 15 minutes”-the interval is calculated from the last successful execution. If you manually execute a query, the scheduled query will not be executed until the interval has passed.
* If you schedule a time, Databricks SQL waits for the results to be “outdated”. For example, if you have a query set to refresh every Thursday and you manually execute it on Wednesday, by Thursday the results will still be considered “valid”, so the query wouldn’t be scheduled for a new execution. Thus, for example, when setting a weekly schedule, check the last query execution time and expect the scheduled query to be executed on the selected day after that execution is a week old. Make sure not to manually execute the query during this time.
If a query execution fails, Databricks SQL retries with a back-off algorithm. The more failures the further away the next retry will be (and it might be beyond the refresh interval).
Refer documentation for additional info,
https://docs.microsoft.com/en-us/azure/databricks/sql/user/queries/schedule-query