This guide walks you through extracting and archiving Q&A pairs from a Copilot conversation, from saving your chat to generating a clean, shareable archive.
This script reads your saved copilot_conversation.html and splits it into clean Q&A chunks.
Create a file called chunker.py and paste in the following code:

import os

from bs4 import BeautifulSoup

INPUT_FILE = "copilot_conversation.html"
CHUNK_FOLDER = "copilot_chunks_html"

os.makedirs(CHUNK_FOLDER, exist_ok=True)

# Parse the saved conversation.
with open(INPUT_FILE, "r", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

# Each message sits in a <div> tagged with a data-content attribute.
user_blocks = soup.find_all("div", attrs={"data-content": "user-message"})
ai_blocks = soup.find_all("div", attrs={"data-content": "ai-message"})

chunks = []
for user, ai in zip(user_blocks, ai_blocks):
    # The user's display name is in the nearest preceding name <div>;
    # fall back to "User" if it is missing.
    name_tag = user.find_previous("div", class_="text-foreground-600")
    user_name = name_tag.get_text(strip=True) if name_tag else "User"
    user_text = user.get_text(separator="\n", strip=True)
    ai_text = ai.get_text(separator="\n", strip=True)
    chunk = f"🧑 {user_name}:\n{user_text}\n\n🤖 Copilot:\n{ai_text}"
    chunks.append(chunk)

# Write one numbered text file per Q&A pair.
for i, chunk in enumerate(chunks, 1):
    with open(f"{CHUNK_FOLDER}/chunk_{i:04}.txt", "w", encoding="utf-8") as f:
        f.write(chunk)

print(f"✅ Saved {len(chunks)} chunks to '{CHUNK_FOLDER}' folder.")

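
One thing to watch in chunker.py: zip pairs user and Copilot messages purely by position and stops at the shorter list, so a missing reply can silently misalign or drop pairs. The sketch below is one way to check the counts before trusting the chunks. It uses only the standard library, and MessageCounter and count_messages are hypothetical helper names, not part of the scripts in this guide.

```python
from html.parser import HTMLParser


class MessageCounter(HTMLParser):
    """Counts <div> tags by their data-content attribute value."""

    def __init__(self):
        super().__init__()
        self.counts = {"user-message": 0, "ai-message": 0}

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        kind = dict(attrs).get("data-content")
        if kind in self.counts:
            self.counts[kind] += 1


def count_messages(html_text):
    """Return (user_count, ai_count) for the given HTML string."""
    counter = MessageCounter()
    counter.feed(html_text)
    counter.close()
    return counter.counts["user-message"], counter.counts["ai-message"]


# Tiny hand-written sample: two user messages but only one reply.
sample = (
    '<div data-content="user-message">Q1</div>'
    '<div data-content="ai-message">A1</div>'
    '<div data-content="user-message">Q2</div>'
)
users, replies = count_messages(sample)
if users != replies:
    print(f"Warning: {users} user messages but {replies} Copilot replies")
```

To check the real file, read copilot_conversation.html with encoding="utf-8" and pass its contents to count_messages.
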
Then run the script in your terminal:
python chunker.py
This will create a folder called copilot_chunks_html containing files like:

chunk_0001.txt
chunk_0002.txt

The next script reads each chunk file and extracts the user's name and question, along with the first paragraph of Copilot's reply.
Create a file called extract_qa.py and paste in the following code:

import os
import re

CHUNK_FOLDER = "copilot_chunks_html"
OUTPUT_FILE = "copilot_qa_archive.txt"

qa_entries = []

# Emoji that open a new section of a Copilot reply. A line starting with one
# of these ends the first paragraph. Add any others your conversations use.
emoji_section_starters = (
    "✅", "🛠️", "🧠", "🧪", "🧵", "✨", "🧾", "🧱", "🧶", "🧰", "🧩", "🧼",
    "🧴", "🧨", "🧸", "🧳", "🧺", "🧽", "🧯", "🧿",
)


def extract_first_paragraph(lines, start_index):
    """Return the lines of the first paragraph starting at start_index."""
    paragraph = []
    for i, line in enumerate(lines[start_index:]):
        stripped = line.strip()
        if i == 0:
            # Skip the "Copilot said" label if it opens the reply.
            if stripped.lower() == "copilot said":
                continue
            paragraph.append(stripped)
            continue
        # Stop at a blank line, the next speaker, a list item or heading,
        # or an emoji section header.
        if (
            stripped == ""
            or stripped.startswith(("🧑", "🤖"))
            or re.match(r"^[-•*#]+", stripped)
            or re.match(r"^\d+\.", stripped)
            or stripped.startswith(emoji_section_starters)
        ):
            break
        paragraph.append(stripped)
    return paragraph


for filename in sorted(os.listdir(CHUNK_FOLDER)):
    if not filename.endswith(".txt"):
        continue
    with open(os.path.join(CHUNK_FOLDER, filename), "r", encoding="utf-8") as f:
        lines = f.readlines()

    user_line = None
    user_text = []
    copilot_line = None
    copilot_text = []

    # The user's header line, then everything up to Copilot's header.
    for i, line in enumerate(lines):
        if line.startswith("🧑"):
            user_line = line.strip()
            for follow_line in lines[i + 1:]:
                if follow_line.startswith("🤖"):
                    break
                if follow_line.strip():
                    user_text.append(follow_line.strip())
            break

    # Copilot's header line, then only the first paragraph of the reply.
    for i, line in enumerate(lines):
        if line.startswith("🤖"):
            copilot_line = line.strip()
            copilot_text = extract_first_paragraph(lines, i + 1)
            break

    if user_line and copilot_line:
        qa_entries.append(
            f"--- {filename} ---\n{user_line}\n"
            + "\n".join(user_text) + "\n\n"
            + f"{copilot_line}\n"
            + "\n".join(copilot_text) + "\n"
        )

with open(OUTPUT_FILE, "w", encoding="utf-8") as f:
    f.write("\n".join(qa_entries))

print(f"✅ Q&A archive saved to '{OUTPUT_FILE}'")

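
The first-paragraph rule in extract_first_paragraph can be tried in isolation to see what it keeps and what it drops. The sketch below is a condensed stand-in for that function, not a replacement: first_paragraph, STOP_PREFIXES, and the sample reply are illustrative assumptions, and the emoji list is shortened to four entries.

```python
import re

# Assumption: a shortened stand-in for the speaker markers plus
# emoji_section_starters used in extract_qa.py.
STOP_PREFIXES = ("🧑", "🤖", "✅", "✨", "🧠", "🧪")
LIST_OR_HEADING = re.compile(r"^([-•*#]+|\d+\.)")


def first_paragraph(lines):
    """Collect lines up to the first blank line, list item, heading, or emoji header."""
    kept = []
    for index, raw in enumerate(lines):
        line = raw.strip()
        if index == 0 and line.lower() == "copilot said":
            continue  # skip the label that can precede the reply
        if index > 0 and (
            not line
            or LIST_OR_HEADING.match(line)
            or line.startswith(STOP_PREFIXES)
        ):
            break
        kept.append(line)
    return kept


reply = [
    "Copilot said",
    "Establish a consistent routine.",
    "Keep a dedicated workspace.",
    "",
    "✅ Quick tips",
    "- Take regular breaks",
]
print(first_paragraph(reply))
```

Only the two sentences before the blank line survive; the emoji header and the bullet list are left out of the archive.
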
Then run the script:
python extract_qa.py
This will generate a file called copilot_qa_archive.txt in the same folder.
Example contents of copilot_qa_archive.txt:

--- chunk_0001.txt ---
🧑 Alex:
How can I improve my productivity when working from home?

🤖 Copilot:
One of the most effective ways to boost productivity at home is to establish a consistent routine and dedicated workspace.

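
For reuse in other tools, the archive can be read back into structured records. The following is a minimal sketch, assuming each entry starts with a "--- chunk_XXXX.txt ---" header on its own line and uses the 🧑 and 🤖 markers on separate lines; parse_archive is a hypothetical helper, not part of the scripts above.

```python
import json
import re

HEADER = re.compile(r"^--- (.+) ---$")
USER_MARK = "🧑"
COPILOT_MARK = "🤖"


def parse_archive(text):
    """Split archive text into dicts with file, user, question, and answer."""
    entries = []
    current = None
    field = None  # which field the following lines belong to
    for line in text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = {"file": header.group(1), "user": "", "question": [], "answer": []}
            entries.append(current)
            field = None
        elif current is None:
            continue  # ignore anything before the first header
        elif line.startswith(USER_MARK):
            current["user"] = line[len(USER_MARK):].strip().rstrip(":")
            field = "question"
        elif line.startswith(COPILOT_MARK):
            field = "answer"
        elif field and line.strip():
            current[field].append(line.strip())
    # Join the collected lines into single strings.
    for entry in entries:
        entry["question"] = "\n".join(entry["question"])
        entry["answer"] = "\n".join(entry["answer"])
    return entries


sample = (
    "--- chunk_0001.txt ---\n"
    "🧑 Alex:\n"
    "How can I improve my productivity when working from home?\n"
    "\n"
    "🤖 Copilot:\n"
    "One of the most effective ways to boost productivity at home is to "
    "establish a consistent routine and dedicated workspace.\n"
)
records = parse_archive(sample)
print(json.dumps(records, ensure_ascii=False, indent=2))
```

To use it on the real archive, read copilot_qa_archive.txt with encoding="utf-8" and pass the text to parse_archive.
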
This entire workflow is designed to help you recover and reuse your Copilot conversations in a structured, portable format. Extracting clean Q&A pairs is especially useful when you've had a long, rich conversation and want to preserve, share, or build on it outside the chat interface.
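Quick reference:

Saving the chat: <html> → Copy outerHTML
Saved conversation: copilot_conversation.html
Chunk folder: copilot_chunks_html/
Chunk files: chunk_0001.txt, chunk_0002.txt, etc.
Q&A archive: copilot_qa_archive.txt
Required package: beautifulsoup4
Scripts: chunker.py and extract_qa.py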