Learning
1 - Configuring
What You’ll Learn
How to configure Learning in Foyle to continually learn from human feedback
How It Works
- As you use Foyle, the AI builds a dataset of examples (input, output)
- The input is a notebook at some point in time, t
- The output is one or more cells that were then added to the notebook at time t+1
- Foyle uses these examples to get better at suggesting cells to insert into the notebook
Configuring RAG
Foyle uses RAG to improve its predictions using its existing dataset of examples. You can control
the number of RAG results used by Foyle by setting agent.rag.maxResults.
foyle config set agent.rag.maxResults=3
Disabling RAG
RAG is enabled by default. To disable it run
foyle config set agent.rag.enabled=false
To check the status of RAG, get the current configuration
foyle config get
Sharing Learned Examples
In a team setting, you should build a shared AI that learns from the feedback of all team members and assists all members. To do this you can configure Foyle to write and read examples from a shared location like GCS. If you’d like S3 support please vote up issue #153.
To configure Foyle to use a shared location for learned examples
Create a GCS bucket to store the learned examples
gsutil mb gs://my-foyle-examples
Configure Foyle to use the GCS bucket
foyle config set learner.exampleDirs=gs://${YOUR_BUCKET}
Optionally you can configure Foyle to use a local location as well if you want to be able to use the AI without an internet connection.
foyle config set learner.exampleDirs=gs://${YOUR_BUCKET},/local/training/examples
2 - Troubleshoot Learning
What You’ll Learn
- How to ensure learning is working and monitor learning
Check Examples
If Foyle is learning there should be example files in ${HOME}/.foyle/training
ls -la ~/.foyle/training
The output should include example.binpb files as illustrated below.
-rw-r--r-- 1 jlewi staff 9895 Aug 28 07:46 01J6CQ6N02T7J16RFEYCT8KYWP.example.binpb
If there aren’t any then no examples have been learned.
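To watch whether learning is progressing, it can help to count the example files rather than eyeball the listing. The following is a minimal sketch, not part of Foyle: `count_examples` is a hypothetical helper name, and it assumes examples are regular files ending in `.example.binpb` directly inside the training directory named above.

```shell
# Count learned example files in a training directory.
# Assumption: examples are regular files named *.example.binpb, as in the
# listing above. The default path is the one this section mentions.
count_examples() {
  dir="${1:-${HOME}/.foyle/training}"
  # A missing directory means nothing has been learned yet.
  [ -d "$dir" ] || { echo 0; return 0; }
  # tr strips the padding that some wc implementations add.
  find "$dir" -maxdepth 1 -type f -name '*.example.binpb' | wc -l | tr -d ' '
}
```

Running `count_examples` before and after you edit and execute a suggested cell should show the count increase if an example was learned.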
Trigger Learning
Foyle’s learning is triggered by the following sequence of actions:
- Foyle generates a suggested cell which is added to the notebook as a Ghost Cell
- You accept the suggested cell by putting the focus on the cell
- You edit the cell
- You execute the cell
When you execute the cell, the execution is logged to Foyle. For each executed cell Foyle checks
- Was that cell generated by Foyle?
- If the cell was generated by Foyle, did the actual command executed differ from the suggested command?
- If the cell was changed by the user, then Foyle attempts to learn from that execution
Crucially, every cell created by Foyle is assigned an ID. This ID can be used to track how the cell was generated and if learning occurred.
To get the cell ID for a given cell
- Open the raw markdown file by right clicking on it in VSCode and selecting Open With -> Text Editor
- Find the code block containing your cell
- Your cell will contain metadata which contains the ID e.g.
` ` `bash {"id":"01J6DG428ER427GJNTKC15G6JM"}
echo hello world
` ` `
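If a notebook has many cells, listing every ID from the command line can be quicker than opening the file. The sketch below is an illustration only: `list_cell_ids` is a hypothetical helper, and it assumes each cell's opening fence is three consecutive backticks followed by a language and JSON metadata containing an `"id"` field, as in the example above.

```shell
# Print the ID of every fenced code cell in a markdown notebook.
# Assumption: the opening fence line is three backticks, a language name,
# then JSON metadata such as {"id":"01J6DG428ER427GJNTKC15G6JM"}.
list_cell_ids() {
  # Match lines that start with three backticks and carry an "id" field,
  # then print only the captured ID value.
  sed -n 's/^`\{3\}[^{]*{.*"id": *"\([^"]*\)".*/\1/p' "$1"
}
```

For example, `list_cell_ids notebook.md` prints one ID per line, in the order the cells appear in the file.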
Did Block Logs Get Created
- Get the block logs for the cell
- Change the cell ID to the ULID of the cell (you can view this in the markdown)
- Pick a cell that was generated by the AI and on which you think learning should have occurred
CELLID=01J7S3QZMS5F742JFPWZDCTVRG
curl -X POST http://localhost:8877/api/foyle.logs.LogsService/GetBlockLog -H "Content-Type: application/json" -d "{\"id\": \"${CELLID}\"}" | jq .
- If this returns not found then no log was created for this cell and there is a problem with Log Processing
- The correct output should look like the following
{
"blockLog": {
"id": "01J7KQPBYCT9VM2KFBY48JC7J0",
"genTraceId": "0376c6dc6309bcd5d61e7b56e41d6411",
"doc": {
...
},
"generatedBlock": {
"kind": "CODE",
"language": "bash",
"contents": "jq -c 'select(.severity == \"error\" or .level == \"error\")' ${LASTLOG}",
"id": "01J7KQPBYCT9VM2KFBY48JC7J0"
},
"executedBlock": {
"kind": "CODE",
"contents": "CELLID=01J7KQPBYCT9VM2KFBY48JC7J0\ncurl -X POST http://localhost:8877/api/foyle.logs.LogsService/GetBlockLog -H \"Content-Type: application/json\" -d \"{\\\"id\\\": \\\"${CELLID}\\\"}\" | jq .",
"id": "01J7KQPBYCT9VM2KFBY48JC7J0"
},
"resourceVersion": "34d933d8-abe6-4ad3-b9cf-5a2392f34abb"
}
}
Notably the output should include the following fields
- generatedBlock - This is the block that was generated by the AI
- executedBlock - This is the block that the user actually executed
If the generatedBlock and executedBlock are the same then no learning occurred
If the generatedBlock is missing then this means the block wasn’t generated by Foyle and learning won’t occur
- This can happen if you insert a blank cell and manually enter a command
If the executedBlock is missing then this means the block wasn’t executed and learning won’t occur
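The rules above can be sketched as a small decision function. This is an illustration of the checks described here, not Foyle's actual implementation: `classify_block` and the labels it prints are hypothetical, and it assumes you have already extracted the two `contents` strings from the block log, passing an empty string for a missing block.

```shell
# Classify a block according to the rules described above.
# $1 = generatedBlock contents, $2 = executedBlock contents.
# Pass "" for a block that is missing from the block log.
# The labels printed here are hypothetical, chosen for this sketch.
classify_block() {
  generated="$1"
  executed="$2"
  if [ -z "$generated" ]; then
    echo "not-generated"   # the cell was not produced by Foyle
  elif [ -z "$executed" ]; then
    echo "unexecuted"      # the cell was never run
  elif [ "$generated" = "$executed" ]; then
    echo "unchanged"       # run as suggested, so nothing to learn
  else
    echo "learnable"       # edited before running
  fi
}
```

One way to obtain the inputs is to save the GetBlockLog response to a file and read each field with `jq -r '.blockLog.generatedBlock.contents // ""'` and `jq -r '.blockLog.executedBlock.contents // ""'`.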
Was a cell executed?
- If a block is missing the executedBlock then we should check the logs to see if there is an event for cell execution
export LASTLOG=~/.foyle/logs/raw/$(ls -t ~/.foyle/logs/raw | head -n 1 )
echo "Last log file: ${LASTLOG}"
jq -c "select(.selectedCellId == \"01J7KQPBYCT9VM2KFBY48JC7J0\")" ${LASTLOG}
- If there are no execution events then the cell was never executed
- If you executed the cell but there are no log events then there is most likely a bug; please open an issue on GitHub
Check the logs associated with that cell
- We can search for all logs associated with that cell
export LASTLOG=~/.foyle/logs/raw/$(ls -t ~/.foyle/logs/raw | head -n 1 )
echo "Last log file: ${LASTLOG}"
jq -c "select(.blockId == \"${CELLID}\")" ${LASTLOG}
- Check for any errors processing the block
- Note that the above command will only process the most recent log file
- Each time Foyle is restarted it will create a new log file.
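Because each restart starts a new log file, the events for a cell may sit in an older file. The following sketch finds every raw log file that mentions a cell ID so that you can point the jq queries at the right one. `logs_for_cell` is a hypothetical helper, and it assumes the ID appears verbatim in each relevant log line and that the logs live in the directory used above.

```shell
# List every raw log file that mentions a cell ID, newest first.
# Assumptions: logs live in ~/.foyle/logs/raw (as above) and the cell ID
# appears verbatim in any log line that concerns the cell.
logs_for_cell() {
  cell_id="$1"
  log_dir="${2:-${HOME}/.foyle/logs/raw}"
  [ -d "$log_dir" ] || return 0
  ls -t "$log_dir" | while IFS= read -r f; do
    # Skip anything that is not a regular file.
    [ -f "$log_dir/$f" ] || continue
    # -F treats the ID as a fixed string, -q suppresses the matched lines.
    if grep -q -F -- "$cell_id" "$log_dir/$f"; then
      echo "$log_dir/$f"
    fi
  done
}
# Example: set LASTLOG to the newest file that mentions the cell.
#   LASTLOG="$(logs_for_cell "${CELLID}" | head -n 1)"
```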
Did we try to create an example from any cells?
- If Foyle tries to learn from a cell it logs a message
- We can query for that log as follows
jq -c 'select(.message == "Found new training example")' ${LASTLOG}
- If that returns nothing then we know Foyle never tried to learn from any cells
- If it returns something then we know Foyle tried to learn from a cell but it may have failed
- If there is an error processing an example it gets logged
- So we can search for that error message in the logs
jq -c 'select(.message == "Failed to write example")' ${LASTLOG}
jq -c 'select(.level == "error" and .message == "Failed to write example")' ${LASTLOG}
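For a quick overview, the two messages can be counted side by side. This is a rough sketch: `learning_summary` is a hypothetical helper, and it counts log lines that contain each message text as a substring, so it may over-count if other log lines quote the same text.

```shell
# Summarize learning attempts and write failures in one log file.
# Assumption: each log entry is a single line containing the message text
# quoted in this section.
learning_summary() {
  log_file="$1"
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the
  # function usable in scripts that run with "set -e".
  found=$(grep -c -F 'Found new training example' "$log_file" || true)
  failed=$(grep -c -F 'Failed to write example' "$log_file" || true)
  echo "found=${found} failed=${failed}"
}
```

Calling `learning_summary "${LASTLOG}"` prints a single line such as `found=2 failed=1`. A nonzero failed count is a cue to inspect the matching entries with the jq queries above.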
Ensure Block Logs are being created
- The query below checks that block logs are being created.
- If no logs are being processed then there is a problem with the block log processing.
jq -c 'select(.message == "Building block log")' ${LASTLOG}
Are there any errors in the logs
- The query below should show you any errors in the logs.
jq -c 'select(.severity == "error")' ${LASTLOG}
Check Prometheus counters
Check to make sure blocks are being enqueued for learner processing
curl -s http://localhost:8877/metrics | grep learner_enqueued_total
- If the number is 0 please open an issue in GitHub because there is most likely a bug
Check the metrics for post processing of blocks
curl -s http://localhost:8877/metrics | grep learner_blocks_processed
- The value of learner_blocks_processed{status="learn"} is the number of blocks that contributed to learning
- The value of learner_blocks_processed{status="unexecuted"} is the number of blocks that were ignored because they were not executed
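To read a single counter without scanning the grep output, the metrics text can be filtered by status. The sketch below is an illustration: `block_count` is a hypothetical helper, and it assumes the metric is exposed in the standard Prometheus text format with `status` as its only label.

```shell
# Read Prometheus text-format metrics on stdin and print the value of
# learner_blocks_processed for one status label.
# Assumption: "status" is the only label on the metric, so the series name
# is exactly learner_blocks_processed{status="<value>"}.
block_count() {
  status="$1"
  awk -v want="learner_blocks_processed{status=\"${status}\"}" '
    $1 == want { print $2; found = 1 }
    END { if (!found) print 0 }   # series not present yet
  '
}
# Example:
#   curl -s http://localhost:8877/metrics | block_count learn
```

If the learn count stays at 0 while the unexecuted count grows, the suggested cells are likely not being executed, which matches the checks in the earlier sections.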