These pages describe how to use RunMe and Foyle to interact with various systems.
1 - BigQuery
Why Use Foyle with BigQuery
If you are using BigQuery as your data warehouse then you’ll want to query that data. Constructing the right query is often a barrier to leveraging that data. Using Foyle you can train a personalized AI assistant to be an expert in answering high level questions using your data warehouse.
Prerequisites
Install the Data TableRenderers extension in vscode
- This extension can render notebook outputs as tables
- In particular, this extension can render JSON as nicely formatted, interactive tables
How to integrate BigQuery with RunMe
Below is an example code cell illustrating the recommended pattern for executing BigQuery queries within RunMe.
cat <<EOF >/tmp/query.sql
SELECT
DATE(created_at) AS date,
COUNT(*) AS pr_count
FROM
\`githubarchive.month.202406\`
WHERE
type = "PullRequestEvent"
AND
repo.name = "jlewi/foyle"
AND
json_value(payload, "$.action") = "closed"
AND
DATE(created_at) BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 28 DAY) AND CURRENT_DATE()
GROUP BY
date
ORDER BY
date;
EOF
export QUERY=$(cat /tmp/query.sql)
bq query --format=json --use_legacy_sql=false "$QUERY"
As illustrated above, the pattern is to use cat
to write the query to a file. This allows
us to write the query in a more human readable format. Including the entire SQL query in the code
cell is critical for enabling Foyle to learn the query.
The output is formatted as JSON. This allows the output to be rendered using the Data TableRenderers extension.
Before executing the code cell click the configure button in the lower right hand side of the cell and then uncheck the box under “interactive”. Running the cell in interactive mode prevents the output from being rendered using Data TableRenders. For more information refer to the RunMe Cell Configuration Documentation.
We then use bq query
to execute the query.
Controlling Costs
BigQuery charges based on the amount of data scanned. To prevent accidentally running expensive queries you can
use the --maximum_bytes_billed
to limit the amount of data scanned. BigQuery currently charges
$6.25 per TiB.
Troubleshooting
Output Isn’t Rendered Using Data TableRenderers
If the output isn’t rendered using Data TableRenderers there are a few things to check
Click the ellipsis to the left of the the upper left hand corner and select change presentation
- This should show you different mime-types and options for rendering them
- Select Data table
Another problem could be that
bq
is outputting status information while running the query and this is interfering with the rendering. You can work around this by redirecting stderr to/dev/null
. For example,bq query --format=json --use_legacy_sql=false "$QUERY" 2>/dev/null
Try explicitly configuring the mime type by opening the cell configuration and then
- Go to the advanced tab
- Entering “application/json” in the mime type field
2 - Honeycomb
Why Use Honeycomb with Foyle
Honeycomb is often used to query and visualize observability data.
Using Foyle and RunMe you can
- Use AI to turn a high level intent into an actual Honeycomb query
- Capture that query as part of a playbook or incident report
- Create documents that capture the specific queries you are interested in rather than rely on static dashboards
How to integrate Honeycomb with RunMe
This section explains how to integrate executing Honeycomb queries from RunMe.
Honeycomb supports embedding queries directly in the URL. You can use this feature to define queries in your notebook and then generate a URL that can be opened in the browser to view the results.
Honeycomb queries are defined in JSON. A suggested pattern to define queries in your notebook are
- Create a code cell which uses contains the query to be executed
- To make it readable the recommended pattern is to treat it as a multi-line JSON string and write it to a file
- Use the CLI hccli to generate the URL and open it in the browser
- While the CLI allows passing the query as an argument this isn’t nearly as human readable as writing it and reading it from a temporary file Here’s an example code cell
cat << EOF > /tmp/query3.json
{
"time_range": 604800,
"granularity": 0,
"calculations": [
{
"op": "AVG",
"column": "prediction.ok"
}
],
"filters": [
{
"column": "model.name",
"op": "=",
"value": "llama3"
}
],
"filter_combination": "AND",
"havings": [],
"limit": 1000
}
EOF
hccli querytourl --query-file=/tmp/query3.json --base-url=https://ui.honeycomb.io/YOURORG/environments/production --dataset=YOURDATASET --open=true
- This is a simple query to get the average of the
prediction.ok
column for thellama3
model for the last week - Be sure to replace
YOURORG
andYOURDATASET
with the appropriate values for your organization and dataset- You can determine
base-url
just by opening up any existing query in Honeycomb and copying the URL
- You can determine
- When you execute the cell, it will print the query and open it in your browser
- You can use the share query feature to encode the query in the URL
Training Foyle to be your Honeycomb Expert
To train Foyle to be your Honeycomb you follow these steps
Make sure you turned on learning by following the guide
Create a markdown cell expressing the high level intent you want to accomplish
Ask Foyle to generate a completion
- The first time you do this Foyle will probably provide a widely inaccurate answer because it has no prior knowledge of your infrastructure
- Edit one of the generated code cells to contain the Honeycomb query and
hccli
command - Execute the cell to generate the URL
Each time you ask for a completion and edit and execute the cell you are providing feedback to Foyle
- Foyle uses the feedback to learn and improve the completions it generates
Important When providing feedback to Foyle its important to do so by editing a code cell that was generated by Foyle. If you create a new code cell Foyle won’t be able to learn from it. Only cells that were generated by Foyle are linked to the original query.
If you need help boostrapping some initial Honeycomb JSON queries you can use Honeycomb’s Share feature to generate the JSON from a query constructed in the UI.
Producing Reports
RunMe’s AutoSave Feature creates a markdown document that contains the output of the code cells. This is great for producing a report or writing a postmortem. When using this feature with Honeycomb you’ll want to capture the output of the query. There are a couple ways to do this
Permalinks
To generate permalinks, you just need to use can use start_time
and
end_time
to specify a fixed time range for your queries (see Honeycomb query API). Since the
hccli prints the output URL it will be saved in the session outputs generated
by RunMe. You could also copy and past the URL into a markdown cell.
Grabbing a Screenshot
Unfortunately a Honeycomb enterprise plan is required to access query results and graphs via API. As a workaround, hccli supports grabbing a screenshot of the query results using browser automation.
You can do this as follows
Restart chrome with a debugging port open
chrome --remote-debugging-port=9222
Login into Honeycomb
Add the
--out-file=/path/to/file.png
tohccli
to specify a file to save it tohccli querytourl --query-file=/tmp/query3.json --base-url=https://ui.honeycomb.io/YOURORG/environments/production --dataset=YOURDATASET --open=true --out-file=/path/to/file.png
Warning Unfortunately this tends to be a bit brittle for the following reasons
You need to restart chrome with remote debugging enabled
hccli
will use the most recent browser window so you need to make sure your most recent browser window is the one with your Honeycomb credentials. This may not be the case if you have different accounts logged into different chrome sessions