Prior to starting a historical data migration, ensure you do the following:
- Create a project on our US or EU Cloud.
- Sign up to a paid product analytics plan on the billing page (historic imports are free but this unlocks the necessary features).
- Raise an in-app support request with the Data pipelines topic detailing where you are sending events from, how, the total volume, and the speed. For example, "we are migrating 30M events from a self-hosted instance to EU Cloud using the migration scripts at 10k events per minute."
- Wait for the OK from our team before starting the migration process to ensure that it completes successfully and is not rate limited.
- Set the historical_migration option to true when capturing events in the migration.
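As a rough illustration of the last step, the sketch below builds a request body for the batch capture endpoint with historical_migration set as a top-level key. The helper name build_batch_payload, the placeholder API key, and the sample event are hypothetical; check the capture API docs for the exact payload shape before relying on it.

```python
import json

def build_batch_payload(api_key, events):
    """Wrap already-converted events in a batch body flagged as a historical migration.

    Assumption: the flag is sent as a top-level key next to api_key and batch.
    """
    return {
        "api_key": api_key,
        "historical_migration": True,
        "batch": events,
    }

# Hypothetical sample event, already in PostHog's schema.
events = [
    {
        "event": "$pageview",
        "distinct_id": "user_123",
        "timestamp": "2024-01-01T12:00:00+00:00",
        "properties": {"$current_url": "https://example.com/pricing"},
    }
]

payload = build_batch_payload("<ph_project_api_key>", events)
body = json.dumps(payload)  # JSON string to send as the POST body
```

Nothing is sent here; the resulting body would be posted to the batch endpoint of your US or EU Cloud project once our team has given the OK.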
Migrating data from Pendo is a two-step process:
- Export data via Pendo Data Sync
- Convert Pendo data to the PostHog schema and capture in PostHog
1. Export data via Pendo Data Sync
Pendo Data Sync enables you to export data to cloud storage like S3, Azure, or Google Cloud. This requires Ultimate, their highest pricing tier. See their docs for details on how to set it up.
This exports events, features, guides, pages, and more in .avro format, which we can then convert and capture into PostHog.
Want to make this guide better (and get a $75 merch code)? We're looking for sample Pendo data from their Data Sync or aggregations API to improve this guide. Email ian@posthog.com if you have access and are willing to share.
2. Convert Pendo data to the PostHog schema and capture in PostHog
The schema of Pendo's exported event data is similar to PostHog's schema, but it needs converting to work with the rest of PostHog's data. You can find details on Pendo's schema in their docs, and the events and properties PostHog autocaptures in our docs.
If you have done a single historical export, you can read the allevents.avro file to get the event data. With it, you can then go through each row and convert it to PostHog's schema. This requires converting:
- Event names like load to $pageview.
- Properties like url to $current_url.
- The event browserTimestamp to an ISO 8601 timestamp.
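The three conversions above can be sketched for a single row as follows. This is a minimal illustration, not a complete mapping: only load, url, and browserTimestamp come from this guide, while the type and visitorId field names, the assumption that browserTimestamp is in epoch milliseconds, and the convert_row helper are hypothetical and should be checked against your own export.

```python
from datetime import datetime, timezone

# Mappings from Pendo names to PostHog names; extend these for your own data.
EVENT_NAME_MAP = {"load": "$pageview"}
PROPERTY_MAP = {"url": "$current_url"}

# Assumed Pendo field names (hypothetical, verify against your export).
EVENT_FIELD = "type"
ID_FIELD = "visitorId"
TIME_FIELD = "browserTimestamp"

def convert_row(row):
    """Convert one exported Pendo row (a dict) to a PostHog-style event dict."""
    # Assumption: browserTimestamp is milliseconds since the Unix epoch.
    timestamp = datetime.fromtimestamp(
        row[TIME_FIELD] / 1000, tz=timezone.utc
    ).isoformat()

    # Rename known properties and pass the rest through unchanged.
    properties = {
        PROPERTY_MAP.get(key, key): value
        for key, value in row.items()
        if key not in (EVENT_FIELD, ID_FIELD, TIME_FIELD)
    }

    return {
        "event": EVENT_NAME_MAP.get(row[EVENT_FIELD], row[EVENT_FIELD]),
        "distinct_id": row[ID_FIELD],
        "timestamp": timestamp,
        "properties": properties,
    }

# Hypothetical sample row for illustration only.
sample_row = {
    "type": "load",
    "visitorId": "user_123",
    "browserTimestamp": 1704110400000,
    "url": "https://example.com/pricing",
}
converted = convert_row(sample_row)
```

Keeping the mappings in plain dictionaries makes it easy to add more event and property names as you find them in your export.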
Once this is done, you can capture the data into PostHog using the Python SDK or the capture API endpoint with historical_migration set to true. 
Here's an example version of a Python script reading from an allevents.avro file: