Skype / Microsoft Teams Export Importer

Imports a Microsoft Skype or Microsoft Teams (free / personal) data export (messages.json, endpoints.json, invites.json) into a MongoDB database so the content can be browsed, queried, and searched with any Mongo client (MongoDB Compass, mongosh, Studio 3T, etc.).

Both products share the same export format — consumer Skype was merged into Teams, and the "Download your data" flow produces the same JSON schema for either. Conversations from both apps coexist in a single export:

@thread.skype, @cast.skype — classic Skype group chats
@thread.v2, uni01_...@thread.v2, meeting_...@thread.v2 — Microsoft Teams chats, meetings, and communities
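
Because the app of origin is encoded in the id suffix, conversations can be grouped with a small helper. The following is a minimal sketch; `classifyConversationId` and its return labels are hypothetical names chosen for illustration and are not part of the importer.

```javascript
// Hypothetical helper: infer the originating app from a conversation id suffix.
// The suffixes come from the list above; the labels are illustrative only.
function classifyConversationId(id) {
  if (id.endsWith("@thread.skype") || id.endsWith("@cast.skype")) {
    return "skype"; // classic Skype group chat
  }
  if (id.endsWith("@thread.v2")) {
    return "teams"; // Teams chat, meeting, or community
  }
  return "other"; // anything the list above does not cover
}

console.log(classifyConversationId("19:abc@thread.skype")); // prints "skype"
console.log(classifyConversationId("19:meeting_xyz@thread.v2")); // prints "teams"
```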

The messages.json file can be very large (tens to hundreds of MB) and is painful to open in a text editor. This script streams it incrementally — it never loads the full file into memory.

Scope — "Chat history" export only. Microsoft's Export my data page offers two export types: Chat history and Media. This tool only processes the Chat history bundle (messages.json, endpoints.json, invites.json). Images, videos, and other attachments from the Media export are out of scope — message documents will still reference them (via amsreferences or <URIObject> blobs in content), but the binary files themselves are not imported.

What gets imported

The script populates five collections in the target database (skype_export by default):

Collection	Source	`_id`	Description
`conversations`	`messages.json`	conversation id	Conversation metadata (display name, thread properties, member count, etc.)
`messages`	`messages.json`	`${conversationid}:${id}`	One document per message, linked to its conversation via `conversationid`
`endpoints`	`endpoints.json`	`endpointId`	Device/transport records (TROUTER, FCM, APNs, etc.)
`invites`	`invites.json` → `conversations`	conversation id	Per-conversation invite link + history
`user`	`invites.json` → `user`	`"invites"`	User-level invite link, history, community notification settings

Indexes created on messages:

{ conversationid: 1, originalarrivaltime: -1 } — list a chat's messages newest-first
{ messagetype: 1 } — filter by type (RichText, RichText/Html, ThreadActivity/AddMember, …)
{ displayName: 1 } — find messages from a particular sender
{ content: "text", displayName: "text" } (named content_text) — full-text search over message bodies and sender names, usable immediately with db.messages.find({ $text: { $search: "..." } })
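
For reference, the four indexes above could be declared as data and applied in one pass. This is a sketch, not the code in import.js: the `MESSAGE_INDEXES` array and the `ensureMessageIndexes` function are assumed names, and `collection` is assumed to be a MongoDB Node.js driver collection (anything with a compatible `createIndex(keys, options)` method works).

```javascript
// Index specs copied from the list above. The array layout is an assumption.
const MESSAGE_INDEXES = [
  { keys: { conversationid: 1, originalarrivaltime: -1 }, options: {} },
  { keys: { messagetype: 1 }, options: {} },
  { keys: { displayName: 1 }, options: {} },
  { keys: { content: "text", displayName: "text" }, options: { name: "content_text" } },
];

// Creates each index in turn and returns the index names reported back.
async function ensureMessageIndexes(collection) {
  const names = [];
  for (const { keys, options } of MESSAGE_INDEXES) {
    names.push(await collection.createIndex(keys, options));
  }
  return names;
}
```

Creating an index that already exists with the same definition is a no-op in MongoDB, so a function like this can run on every import.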

Nested stringified JSON in threadProperties (members, membersBlocked, membersNicknames) is parsed into real arrays on the way in.
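
A minimal sketch of that parsing step, assuming the three fields arrive as JSON strings and that anything unparseable should be left untouched. The function name `parseThreadProperties` and the fallback behaviour are assumptions, not taken from import.js.

```javascript
// Fields that the export stores as stringified JSON (from the paragraph above).
const STRINGIFIED_FIELDS = ["members", "membersBlocked", "membersNicknames"];

// Returns a copy of threadProperties with the stringified fields parsed.
function parseThreadProperties(threadProperties) {
  if (!threadProperties) return threadProperties;
  const parsed = { ...threadProperties };
  for (const field of STRINGIFIED_FIELDS) {
    if (typeof parsed[field] !== "string") continue;
    try {
      parsed[field] = JSON.parse(parsed[field]);
    } catch {
      // Assumption: keep the raw string when it is not valid JSON.
    }
  }
  return parsed;
}

console.log(parseThreadProperties({ members: '["a","b"]', topic: "t" }).members.length); // prints 2
```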

Data model

Each export file is broken apart and flattened into its own collection. The top-level envelope of messages.json (userId, exportDate, conversations: [...]) and of invites.json (user, conversations) is discarded — the useful content lives one level down.

`conversations`

One document per chat. The original MessageList array is stripped out (its entries become documents in messages) and replaced with a messageCount integer.

{
  "_id": "19:<thread-hash>@thread.skype",
  "id":  "19:<thread-hash>@thread.skype",
  "displayName": "<group name>",
  "version": 1700000000000,
  "properties": {
    "conversationblocked": false,
    "lastimreceivedtime": "2026-01-01T00:00:00.000Z",
    "consumptionhorizon": "<numeric;numeric;numeric>",
    "onetoonev2threadid": null
  },
  "threadProperties": {
    "membercount": 3,
    "members": ["<member 1>", "<member 2>", "<member 3>"], // parsed from a JSON string
    "topic": "<topic>",
    "picture": null,
    "description": null
  },
  "messageCount": 1234
}
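
The transformation from a raw conversation entry to the document above can be sketched as follows. `toConversationDoc` is a hypothetical name, and the sketch assumes each raw entry carries its `MessageList` inline, as described at the start of this section.

```javascript
// Hypothetical sketch: drop MessageList, keep the metadata, record the count.
function toConversationDoc(rawConversation) {
  const { MessageList, ...metadata } = rawConversation;
  return {
    ...metadata,
    _id: rawConversation.id,
    messageCount: Array.isArray(MessageList) ? MessageList.length : 0,
  };
}

const doc = toConversationDoc({
  id: "19:x@thread.skype",
  displayName: "Example group",
  MessageList: [{ id: "1" }, { id: "2" }],
});
console.log(doc.messageCount); // prints 2
```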

`messages`

One document per message. Linked to conversations via conversationid. _id is ${conversationid}:${id} so re-imports never create duplicates and the same doc is addressable across runs.

{
  "_id":            "19:<thread-hash>@thread.skype:<message-id>",
  "id":             "<message-id>",
  "conversationid": "19:<thread-hash>@thread.skype",
  "displayName":    "<sender display name>",
  "originalarrivaltime": "2024-01-01T12:00:00.000Z",
  "messagetype":    "RichText",
  "version":        1700000000000,
  "content":        "<message body — plain text or HTML depending on messagetype>",
  "from":           null,
  "properties": {
    "s2spartnername": "chat-service-v1",
    "importedBy": { "Prefix": "8", "Network": "live", "RawValue": "8:live:<user>" },
    "importedTime": "2025-01-01T00:00:00.000Z"
  },
  "amsreferences": null
}
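
The deterministic `_id` is what makes re-imports idempotent. A minimal sketch of how such a document could be built; `toMessageDoc` is an assumed name, and the conversation id is passed in explicitly because the sketch does not assume the raw message already carries it.

```javascript
// Hypothetical sketch: derive a stable _id from the conversation id and message id.
function toMessageDoc(conversationId, message) {
  return {
    ...message,
    conversationid: conversationId,
    _id: `${conversationId}:${message.id}`,
  };
}

const first = toMessageDoc("19:x@thread.skype", { id: "1700000000001", content: "hi" });
const second = toMessageDoc("19:x@thread.skype", { id: "1700000000001", content: "hi" });
console.log(first._id === second._id); // prints true
```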

Common messagetype values you'll see:

Type	Meaning
`RichText`	Plain text message
`RichText/Html`	Formatted (HTML) message
`RichText/Media_GenericFile`	File attachment — `content` is a `<URIObject>` XML blob
`RichText/UriObject`	Image/photo attachment
`ThreadActivity/AddMember`	System event: someone joined the chat
`ThreadActivity/DeleteMember`	System event: someone left/was removed
`ThreadActivity/HistoryDisclosedUpdate`	System event: history visibility toggled
`ThreadActivity/TopicUpdate`	System event: topic changed

`endpoints`

One document per registered endpoint/device. Almost a 1:1 copy of each endpoints[i] from endpoints.json, with _id set to endpointId.

{
  "_id":         "<endpoint-uuid>",
  "endpointId":  "<endpoint-uuid>",
  "aadDeviceId": null,
  "nodeId":      null,
  "timestamp":   "2026-01-01T00:00:00.0000000Z",
  "transports": {
    "transports": [
      { "transportType": "TROUTER", "path": "https://<trouter-host>/...", "contexts": ["MESSAGING"], "isDeleted": false },
      { "transportType": "FCM",     "path": "<fcm-token>",                "contexts": ["TFL"],       "isDeleted": false }
    ]
  }
}
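
Since transports are nested two levels deep, a small helper makes them easier to inspect. This sketch is illustrative only: `toEndpointDoc` and `activeTransportTypes` are hypothetical names, and the field layout is taken from the sample document above.

```javascript
// Hypothetical sketch: copy an endpoints[i] entry and key it by endpointId.
function toEndpointDoc(endpoint) {
  return { ...endpoint, _id: endpoint.endpointId };
}

// Lists the transport types of an endpoint that are not flagged as deleted.
function activeTransportTypes(endpointDoc) {
  const transports = endpointDoc.transports?.transports ?? [];
  return transports.filter((t) => !t.isDeleted).map((t) => t.transportType);
}

const endpointDoc = toEndpointDoc({
  endpointId: "e-1",
  transports: {
    transports: [
      { transportType: "TROUTER", isDeleted: false },
      { transportType: "FCM", isDeleted: true },
    ],
  },
});
console.log(activeTransportTypes(endpointDoc)); // prints [ 'TROUTER' ]
```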

`invites`

One document per conversation that has an invite link (from invites.json → conversations). _id is the conversation id, so you can join against conversations on _id.

{
  "_id":            "19:<thread-hash>@thread.skype",
  "conversationId": "19:<thread-hash>@thread.skype",
  "inviteLink": "https://teams.live.com/l/invite/<token>",
  "inviteLinkHistory": [
    { "inviteLink": "https://teams.live.com/l/inv<old-token-1>", "createdOn": "2026-01-01T00:00:00.0000000Z" },
    { "inviteLink": "https://teams.live.com/l/inv<old-token-2>", "createdOn": "2025-12-01T00:00:00.0000000Z" }
  ]
}

`user`

A single-document collection holding the user-level settings from invites.json → user. Stored as _id: "invites" so it's easy to find.

{
  "_id": "invites",
  "communityNotifications": {
    "inviteOnNetworkEmailOptIn": true,
    "announcementEmailOptIn": true
  },
  "inviteLink": "https://teams.live.com/l/invite/<token>",
  "inviteLinkHistory": []
}
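
Both the `invites` and `user` collections come from the same file, so the split can be sketched in one function. `splitInvites` is a hypothetical name, and the sketch assumes each entry under `conversations` has a `conversationId` field, as in the sample `invites` document.

```javascript
// Hypothetical sketch: turn the invites.json envelope into per-collection documents.
function splitInvites(invitesJson) {
  const inviteDocs = (invitesJson.conversations ?? []).map((entry) => ({
    ...entry,
    _id: entry.conversationId,
  }));
  const userDoc = { ...(invitesJson.user ?? {}), _id: "invites" };
  return { inviteDocs, userDoc };
}

const { inviteDocs, userDoc } = splitInvites({
  user: { inviteLinkHistory: [] },
  conversations: [{ conversationId: "19:x@thread.skype", inviteLinkHistory: [] }],
});
console.log(inviteDocs[0]._id, userDoc._id); // prints 19:x@thread.skype invites
```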

Relationships at a glance

conversations._id  ◀──────────  messages.conversationid
conversations._id  ◀──────────  invites._id

There's no server-side foreign-key enforcement — these are plain documents — but the _id scheme makes $lookup joins trivial if you want them.
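
As an illustration, the pipeline below attaches the parent conversation to each of a chat's most recent messages. It is a sketch: the function name `messagesWithConversation` and the projected fields are choices made for this example, while the collection and field names come from the data model above.

```javascript
// Builds an aggregation pipeline joining messages to their conversation.
function messagesWithConversation(conversationId, limit = 20) {
  return [
    { $match: { conversationid: conversationId } },
    { $sort: { originalarrivaltime: -1 } },
    { $limit: limit },
    {
      $lookup: {
        from: "conversations",
        localField: "conversationid",
        foreignField: "_id",
        as: "conversation",
      },
    },
    { $unwind: "$conversation" }, // one conversation per message
    {
      $project: {
        displayName: 1,
        content: 1,
        originalarrivaltime: 1,
        "conversation.displayName": 1,
      },
    },
  ];
}

// In mongosh: db.messages.aggregate(messagesWithConversation("<conversation _id>"))
```

The `$match` and `$sort` stages line up with the `{ conversationid: 1, originalarrivaltime: -1 }` index, so the join only runs on the handful of documents that survive `$limit`.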

Requirements

Node.js ≥ 20.6 (the built-in --env-file flag was added in 20.6; the script also relies on top-level await). Tested on Node 22.
A reachable MongoDB instance — local, Docker, or Atlas (mongodb+srv://...).

Check your Node version:

node --version

If you need to upgrade, use nvm:

nvm install 22
nvm use 22

Setup

npm install
cp .env.example .env   # then edit .env if your Mongo lives elsewhere

.env supports:

Variable	Default	Description
`MONGO_URI`	`mongodb://localhost:27017`	MongoDB connection string
`MONGO_DB`	`skype_export`	Database name
`EXPORT_DIR`	script directory	Directory containing the three export JSON files
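
The defaults in this table amount to a simple fallback chain. A minimal sketch, assuming the values are read from `process.env` after Node has loaded `.env`; `loadConfig` and the returned property names are hypothetical, and the caller supplies the script directory.

```javascript
// Hypothetical sketch: resolve settings from the environment with the documented defaults.
function loadConfig(env, scriptDir) {
  return {
    uri: env.MONGO_URI || "mongodb://localhost:27017",
    dbName: env.MONGO_DB || "skype_export",
    exportDir: env.EXPORT_DIR || scriptDir,
  };
}

console.log(loadConfig({ MONGO_DB: "chat_archive" }, "/opt/importer").dbName); // prints "chat_archive"
```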

Usage

Place messages.json, endpoints.json, and invites.json next to import.js (or point EXPORT_DIR at them), then run:

npm run import

Before touching MongoDB the script peeks at the first 4 KB of messages.json to make sure it looks like a Skype/Teams chat-history export (contains userId, exportDate, conversations at the top level). If the file is missing or the shape is wrong the run aborts immediately — protection against accidentally pointing EXPORT_DIR at the wrong file and wiping the DB.
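
A check of this kind can be sketched with the standard library alone. The version below is deliberately simpler than a real top-level parse: it reads the first 4 KB and looks for the three key names as quoted substrings. `looksLikeChatExport` and `REQUIRED_KEYS` are assumed names, and the actual check in import.js may differ.

```javascript
const fs = require("node:fs");

// Key names the export envelope is expected to contain (from the paragraph above).
const REQUIRED_KEYS = ["userId", "exportDate", "conversations"];

// Returns true when the head of the file resembles a chat-history export.
function looksLikeChatExport(filePath, peekBytes = 4096) {
  let fd;
  try {
    fd = fs.openSync(filePath, "r");
  } catch {
    return false; // missing or unreadable file
  }
  try {
    const buffer = Buffer.alloc(peekBytes);
    const bytesRead = fs.readSync(fd, buffer, 0, peekBytes, 0);
    const head = buffer.toString("utf8", 0, bytesRead);
    return (
      head.trimStart().startsWith("{") &&
      REQUIRED_KEYS.every((key) => head.includes(`"${key}"`))
    );
  } finally {
    fs.closeSync(fd);
  }
}
```

Reading a fixed-size prefix keeps the check cheap even when messages.json is hundreds of megabytes.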

By default the script runs in wipe-and-insert mode, so it shows a destructive-action warning:

⚠  DESTRUCTIVE  These collections will be WIPED and re-imported:
   db:          skype_export
   uri:         mongodb://localhost:27017
   collections: conversations, messages, endpoints, invites, user
   Any data manually added to these collections will be lost.
   (Tip: pass --upsert for a non-destructive re-import.)

Type 'yes' to continue:

Only yes / y proceeds. Flags:

Flag	Effect
`--upsert`	Non-destructive re-import. Uses `bulkWrite` with `upsert: true`, so existing imported docs are replaced by `_id` and documents you added yourself (tags, notes, extra fields on your own docs) survive.
`--yes`/`-y`	Skip the interactive prompt — useful for scripted runs.
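
The flag handling and the confirmation rule can be sketched as two small functions. Both names are hypothetical, and treating the answer case-insensitively after trimming whitespace is an assumption of this sketch rather than documented behaviour.

```javascript
// Hypothetical sketch: read the two documented flags from the argument list.
function parseFlags(argv) {
  return {
    upsert: argv.includes("--upsert"),
    skipPrompt: argv.includes("--yes") || argv.includes("-y"),
  };
}

// Only "yes" or "y" counts as confirmation (case and whitespace handling assumed).
function isConfirmed(answer) {
  return ["yes", "y"].includes(answer.trim().toLowerCase());
}

console.log(parseFlags(["--upsert", "-y"])); // prints { upsert: true, skipPrompt: true }
console.log(isConfirmed("no")); // prints false
```

In a script these would typically be fed `process.argv.slice(2)` and the line read from standard input.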

Examples:

npm run import                  # wipe + re-import, with confirmation
npm run import -- --upsert      # non-destructive re-import, with confirmation
npm run import -- --upsert -y   # non-destructive, no prompt

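Running the script multiple times never creates duplicates in either mode: _ids are deterministic, so a wipe-and-insert run rebuilds the same documents and --upsert runs stay idempotent. Only the default wipe mode discards data you added to these collections manually.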
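
To show why deterministic `_id`s make upserts idempotent, here is a sketch of the operations a `bulkWrite` call could receive. `toUpsertOps` is a hypothetical name, and the use of `replaceOne` is inferred from the "replaced by `_id`" wording above rather than confirmed from import.js.

```javascript
// Hypothetical sketch: one replace-or-insert operation per document, keyed by _id.
function toUpsertOps(docs) {
  return docs.map((doc) => ({
    replaceOne: { filter: { _id: doc._id }, replacement: doc, upsert: true },
  }));
}

const ops = toUpsertOps([{ _id: "19:x@thread.skype:1", content: "hi" }]);
console.log(ops[0].replaceOne.filter); // prints { _id: '19:x@thread.skype:1' }

// With the MongoDB Node.js driver this could be sent as:
//   await collection.bulkWrite(ops, { ordered: false });
```

Running the same batch twice targets the same `_id`s, so the second run overwrites documents instead of adding new ones.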

Browsing the data

Once imported, connect with any Mongo client. Some useful queries:

// 1-to-1 and group chat list sorted by most recent activity
db.conversations.find().sort({ version: -1 }).limit(50);

// All real chat messages in a given conversation, newest first
db.messages
  .find({
    conversationid: "<paste a conversation _id here>",
    messagetype: { $in: ["RichText", "RichText/Html"] },
  })
  .sort({ originalarrivaltime: -1 });

// Everything sent by a specific person
db.messages.find({ displayName: "<sender display name>" }).sort({ originalarrivaltime: -1 });

// Full-text search (the importer already created the text index)
db.messages.find({ $text: { $search: "<keyword>" } });

Files

import.js — the streaming importer
package.json — dependencies (mongodb, stream-json) and the import script
.env.example — template for connection settings
.gitignore — keeps node_modules/, .env, and any *.json data dumps out of git

Notes on privacy

The exports contain personal chat history. The default .gitignore excludes *.json (other than package.json / package-lock.json) so you don't accidentally commit them. Double-check before pushing to any public remote.
