This page explains how to set up a connection in Looker to Google BigQuery Standard SQL or Google BigQuery Legacy SQL.
The general steps for setting up a Google BigQuery Standard SQL or Google BigQuery Legacy SQL connection are:
- Create a service account with access to the Google project and download the JSON credentials certificate.
- Create a temporary dataset for storing persistent derived tables.
- Set up the Looker connection to your database.
- Test the connection.
Creating a service account and downloading the JSON credentials certificate
Open the credentials page in the Google Cloud Platform API Manager and, if necessary, select your project:
Click CREATE CREDENTIALS and choose Service account:
Enter a name for the new service account, optionally add a description, and click CREATE:
Your service account requires two Google BigQuery predefined roles:
- BigQuery > BigQuery Data Editor
- BigQuery > BigQuery Job User
Select the first role in the Select a role field, then click ADD ANOTHER ROLE and select the second role:
After selecting both roles, click CONTINUE:
Click CREATE KEY:
Select JSON and click CREATE:
The JSON key will be saved to your computer. BE SURE TO REMEMBER WHERE IT IS SAVED. After noting the download location, click CLOSE:
Find the email address corresponding to the service account. You will need this to configure the Looker connection to BigQuery.
Creating a temporary dataset for persistent derived tables
If you enable persistent derived tables (PDT) for your BigQuery connection, Looker will display the Temp Database field. In this field, you’ll enter the dataset name that Looker should use to create PDTs. You should configure this database or schema ahead of time, with the appropriate write permissions.
You can set up a temporary dataset using the Google Cloud BigQuery console.
PDTs are not supported for Google BigQuery connections that use OAuth.
Creating a temporary dataset using the Google Cloud BigQuery console
Open the Google Cloud BigQuery console and select your project. Click the CREATE NEW DATASET button:
Give the dataset an ID (typically
looker_scratch) and select your Data location (optional), Default table expiration, and encryption key management solution. Click Create dataset.
Setting up the BigQuery connection in Looker
In the Admin section of Looker, select Connections, and then select New Connection.
Fill out the connection details. The majority of these settings are common to most database dialects and are described on the Connecting Looker to your database documentation page. The following settings are specific to BigQuery:
- Dialect: Select Google BigQuery Standard SQL or Google BigQuery Legacy SQL.
- Project ID: The Google Cloud Platform project ID.
Dataset: The name of the default dataset that you plan to use. If a table doesn’t have a dataset specified, then it is assumed to be in this dataset. (You can also model other datasets in this project.) This must match the name of a dataset in your BigQuery database.
- Use OAuth: Select this box to enable each Looker user to authenticate in to Google BigQuery and authorize Looker to run queries on the database with the user’s BigQuery account. See the Configuring OAuth for BigQuery connections section on this page for more information on implementing OAuth for your BigQuery connection.
- Service Account Email: The email address for the service account, as you verified in an earlier step. This field is not available for BigQuery connections that use OAuth.
- Service Account JSON/P12 File: The certificate file for the service account you downloaded in an earlier step. This field is not available for BigQuery connections that use OAuth.
- Password: If you’re using a legacy .p12 credentials file, enter the password to the .p12 credentials file. If you’re using a JSON credentials file, leave this field empty. This field is not available for BigQuery connections that use OAuth.
- Additional Params: Add any additional JDBC parameters, such as BigQuery labels (see the Job labels and context comments for BigQuery connections section on this documentation page for more information.) These are some of the other supported parameters:
connectTimeout: Number of milliseconds to wait for a connection. Defaults to 240000.
readTimeout: Number of milliseconds to wait for a read. Defaults to 240000.
rootUrl: If you have a BigQuery instance in a private network, specify an alternate endpoint to connect to BigQuery other than the default public endpoint.
- Temp Dataset: The BigQuery dataset you created earlier for writing persistent derived tables. PDTs are not supported and this field is not available for BigQuery connections that use OAuth.
- Max Billing Gigabytes: Leave blank for no limit. Read more details about pricing here.
- Max Connections: Can be left at the default value initially. Read more about this setting in the Max Connections section of the Connecting Looker to your database documentation page.
- Connection Pool Timeout: Can be left at the default value initially. Read more about this setting in the Connection Pool Timeout section of the Connecting Looker to your database documentation page.
- SQL Runner Precache: If you want SQL Runner to not preload table information and, instead, to load table information only when a table is selected, clear this option. Read more about this setting in the SQL Runner Precache section of the Connecting Looker to your database documentation page.
- Disable Context Comment: This option disables context comments on a BigQuery connection. Context comments are disabled by default for BigQuery connections because the context comments invalidate Google BigQuery’s ability to cache and can negatively impact cache performance. For BigQuery connections, it is recommended that you use job labels instead of SQL query comments. See the Job labels and context comments for BigQuery connections section on this documentation page for more information.
- Database Time Zone: The default time zone for BigQuery is UTC. The time zone setting you specify here needs to match your BigQuery time zone setting.
Job labels and context comments for BigQuery connections
For BigQuery connections, Looker sends query context in the form of BigQuery job labels. By default, Looker sends the following context label keys for BigQuery connections:
looker-context-user_id: The unique identifier for each user on the Looker instance. You can match this user ID to the user IDs on the Users page in the Admin menu.
looker-context-history_id: The unique identifier for each query that is run on the database by the Looker instance. This is the same history ID that is shown in the details of each query on the Queries page.
looker-context-instance_slug: The ID number of the Looker instance that issued the query. Looker support can use this information to help you troubleshoot, if necessary.
You can configure additional job labels for Looker to send with every query on the BigQuery connection by using the Additional Params text field of the Connections page. In the Additional Params field, add an additional JDBC parameter,
labels, and provide a comma-separated list of URL-encoded
key=value pairs. For example, if you include this in the Additional Params field:
%3D is the URL-encoding for
=, so this would add the following two labels to every query that Looker sends to the BigQuery database, in addition to the default Looker context labels:
Note that BigQuery has restrictions on job labels:
- Any connection label that has the same key as a context label will be ignored.
- If the union of connection labels and context labels exceeds the maximum of 64 total labels, context labels are the first to be dropped, followed by connection labels, until the total number of labels is at most 64.
Looker ensures that context labels conform to all BigQuery’s label validity requirements, but does not check connection labels for validity. Configuring invalid connection labels may cause queries to fail.
The BigQuery job labels that Looker sends by default (
looker-context-instance_slug) correspond to the SQL context comments that Looker attaches to SQL queries for database dialects other than BigQuery. For BigQuery connections, context comments are disabled by default because they invalidate BigQuery’s ability to cache, and can negatively impact cache performance. You can enable context comments for a BigQuery connection by deselecting the Disable Context Comment setting for the BigQuery connection. It is recommended that you keep the default setting for Disable Context Comment, so that you are able to use BigQuery’s cache. But if you deselect the Disable Context Comment option for a BigQuery connection, Looker will send SQL context comments and BigQuery job labels to your database. The following image shows a SQL Runner query on a BigQuery database where SQL context comments are enabled. You can see that Looker sends the database both the SQL context comments and the BigQuery job labels, and that they contain the same information:
OAuth for BigQuery connections
Looker supports OAuth for Google BigQuery connections, meaning that each Looker user authenticates in to Google with their own Google OAuth credentials and authorizes Looker to run queries on the database.
OAuth lets database administrators:
- Audit which Looker users are running queries against the database
- Enforce role-based access controls using Google permissions
- Use OAuth tokens for all processes and actions that access Google BigQuery, instead of embedding BigQuery IDs and passwords in multiple places
For BigQuery connections with OAuth:
- If a database administrator changes the BigQuery OAuth client credentials, any schedules or alerts that a user owns will be affected. Users must log in again if their administrator changes the BigQuery OAuth credentials. Users can also go to their Looker Account page from their user profile account page to log in to Google.
- Since BigQuery connections that use OAuth are “per user,” caching policies are also per user, and not just per query. So instead of using cached results whenever the same query is run within the caching period, Looker will use cached results only if the same user has run the same query within the caching period. For further information on caching, see the Caching queries and rebuilding PDTs with datagroups documentation page.
- Persistent derived tables (PDTs) are not supported.
- Admins, when they sudo as another user, will use that user’s OAuth authorization token. See the Users documentation page for information on using the
Configuring a BigQuery database project for OAuth
The following sections describe how to generate OAuth credentials and how to configure an OAuth consent screen. If you’ve already configured an OAuth consent screen for another application in your project, you won’t need to create another — you configure only one consent screen for all applications in a project.
OAuth credentials and the OAuth consent screen must be configured in the Google Cloud Platform Console. Google’s generic description is on the Google Cloud support site and on the Google Dev Console site.
Depending on the type of users accessing BigQuery data in Looker and whether your BigQuery data is public or private, OAuth may not be the most appropriate authentication method. Likewise, the type of data requested from the user and degree of access needed to that user’s data when they’re authenticating into Google to use Looker may require verification by Google. See more about verification in the Generating Google OAuth credentials section on this page.
Generating Google OAuth credentials
Go to the Google Cloud Platform Console.
In the Select a project drop-down, navigate to your BigQuery project. This should take you to your project dashboard.
In the left menu, select the APIs & Services page. Then click Credentials. On the Credentials page, click the down arrow in the Create credentials button, and select OAuth client ID from the drop-down menu:
Google requires that you configure an OAuth consent screen, which lets your users choose how to grant access to their private data, before you can generate your OAuth credentials. To configure your OAuth consent screen, see the Configuring an OAuth consent screen section on this page.
If you’ve already configured an OAuth consent screen, Google displays the Create OAuth client ID page, which lets you create an OAuth client ID and secret to use in your BigQuery connection to Looker:
Select Web application as the application type and, when the page expands, enter a name for the app, such as Looker, in the Name field.
https://. For example:
- If Looker hosts your instance:
- If you have a customer-hosted Looker instance:
- If your Looker instance requires a port number:
In the Authorized redirect URIs field, enter the URL to your Looker instance, followed by
/external_oauth/redirect. For example:
Click Create. Google displays your client ID and your client secret.
Copy your client ID and your client secret values — you will need them to configure the OAuth for the BigQuery connection in Looker.
Configuring an OAuth consent screen
If you’ve already configured an OAuth consent screen for another application in your project, you will not see this option. If you’ve already generated your OAuth credentials, you can proceed to configuring your BigQuery connection for OAuth in Looker.
In the left menu, select the OAuth consent screen page. Before you can configure your OAuth consent screen, you must choose the type of users to whom you’re making this app available. Depending on your selection, your app may require verification by Google.
Make your selection and click Create. Google displays the OAuth consent screen page:
You can configure this screen for all applications in your project, including both internal and public applications.
Google will perform a verification for public applications if any of these are true:
- The application uses Google APIs that use restricted or sensitive scopes.
- The OAuth consent screen includes an application logo.
- The project has exceeded the domain threshold.
In the Application name field, put the name of the application that the user is granting access to — in this case, Looker.
Enter the support email that users should contact with login issues.
Looker requires only the default scopes, so no additional scope configuration is required.
In the Authorized domains field, enter the domain of the URL to your Looker instance. For example, if Looker hosts your instance at
https://<instance_name>.looker.com, the domain is
looker.com. For customer-hosted Looker deployments, enter the domain on which you host Looker.
The remaining fields are optional but can be used to further customize your consent screen.
Once you’ve configured your OAuth consent screen, click Save.
For more information about configuring the Google OAuth consent screen, see Google’s support documentation.
You can now continue on in generating your OAuth credentials.
Configuring OAuth for a BigQuery connection
- In Looker, navigate to your BigQuery configuration page (from the Connections page in the Admin panel) or create a new BigQuery connection.
- Select the Use OAuth checkbox.
When you select Use OAuth, you will see the OAuth Client ID and OAuth Client Secret fields:
Paste in the client ID and your client secret values obtained from following the steps in the Generating Google OAuth credentials section on this page.
- Complete the rest of the procedure for Connecting Looker to your database.
- For a new connection, click the Add Connection button at the bottom of the Connection Settings page. If editing an existing BigQuery connection, click the Update Connection button at the bottom of the Connection Settings page.
- You can now test your connection as needed.
Testing the connection
- Click the Test These Settings button at the bottom of the Connections Settings page, as described on the Connecting Looker to your database documentation page.
- Click the Test button by the connection’s listing on the Connections admin page, as described on the Connections documentation page.
For new connections, if you see Can connect, then press Add connection. This runs the rest of the connection tests to verify that the service account was set up correctly and with the proper roles.
Testing a connection that uses OAuth
- In Looker, go into Development Mode.
- For an existing BigQuery connection that uses OAuth, navigate to the project files for a Looker project that uses your BigQuery connection. For new BigQuery connections that use OAuth, open a model file and replace the model’s
connectionvalue with the name of the your new BigQuery connection, then save the model file.
- Open one of the model’s Explores or dashboards and run a query. When you try to run a query, Looker will prompt you to log in with your Google account. Follow the Google OAuth login prompts.
Authenticating into Google to use a BigQuery connection
Users can perform their initial authentication by either:
- Authenticating into to Google from the query that uses the BigQuery connection
- Authenticating into to Google from the OAuth Connection Credentials section on their Account page
Authenticating into Google from a query
Once the BigQuery connection is set up for OAuth, users will be prompted to log in with their Google account before running queries that use that connection, including queries from Explores, dashboards, Looks, and SQL Runner. Here’s an example Explore that uses a BigQuery connection for which the user must log in:
Authenticating into Google from the user account page
To authenticate into your Google account from your user account page:
- Click your profile icon and select Account from the user menu.
- Scroll down to the OAuth Connection Credentials section and click the Log In button for the desired BigQuery database connection.
- Select the appropriate account from the Sign in with Google page.
- Click Allow on the OAuth consent screen to allow Looker to view and manage your data in Google BigQuery.
Once you authenticate into Google through Looker, you can log out or reauthorize your credentials at any time through your Account page, as described on the Personalizing your user account documentation page. Although Google BigQuery tokens do not expire, a user may click Reauthorize to log in with a different Google account.
Revoking OAuth tokens
Google BigQuery tokens do not expire; however, if a database admin changes the database connection’s OAuth credentials in a way that invalidates the existing credentials, users will have to log in with their Google account again before running any queries that use that connection.
Some schedules or alerts may be based on models with a BigQuery database connection using OAuth. If a BigQuery database admin changes the existing database connection’s OAuth credentials, existing users’ credentials may be invalidated. Any schedules or alerts created by that user will fail until the user logs in again through Google OAuth.
For Looker to support some features, your database dialect must also support them.
In Looker 22.6, Google BigQuery Legacy SQL supports the following Looker features:
In Looker 22.6, Google BigQuery Standard SQL supports the following Looker features:
After you’ve connected your database to Looker, configure sign-in options for your users.