{"slug": "linkedout-see-how-much-data-linkedin-has-on-you", "title": "Linkedout: See how much data LinkedIn has on you", "summary": "Developer Alex Ewerlöf created LinkedOut, a free open-source Chrome extension that reveals the extensive data LinkedIn collects on users, including 54,000 data points from his 16-year account history. The app allows users to browse their downloaded LinkedIn data privately, highlighting issues like IP logging, contact leaks, and GDPR data portability rights.", "body_md": "**TL;DR: LinkedOut is a free open-source app that allows you to browse the [shocking amount of] data LinkedIn gathers and stores on you.**\n\n[Link to the app](https://chromewebstore.google.com/detail/linkedout/dcbkghofdjbpmffcieifebmppeiedckc?authuser=0&hl=en-GB) (Chrome Extension)\n\n[Source code](https://github.com/alexewerlof/linkedout) (on GitHub)\n\n**This post describes why I built the app, what I found out, and how you can use it for yourself completely privately. At the end, as a bonus, I also share the LLM setup for creating this kind of personal app on the cheap with high quality.**\n\n**Disclaimer: no generative AI is used in this post.**\n\n# Intro\n\nI’ve been on LinkedIn since 9:48 PM Oct 22, 2010. How do I know that? Because LinkedIn kept tabs:\n\nIn fact, during the past 16 years I’ve left 54k data points on LinkedIn:\n\n20k messages\n\n11k reactions\n\n4k connections\n\n3.7k comments\n\n2.4k searches\n\n421 ad clicks\n\n14 job applications with hundreds of questions I answered during the application process\n\nThere’s way more data (e.g. IP addresses and my phone contacts), and some of it is a bit creepy!\n\nApparently, 12 years ago I allowed the LinkedIn app to “sync my connection” with something, which leaked the personal information of 3.8k of my contacts (emails, phone numbers, addresses, etc.)! 
🤯\n\nLinkedIn also stores the IP address of every login:\n\n…and verification:\n\nThen there are messages:\n\n11.1k received\n\n9.7k sent\n\nThis is the CSV (comma-separated values) file you get:\n\n… and this is how the app renders it:\n\nThat’s pretty much the idea behind the app! I wanted to see how much of LinkedIn I could recreate from a 4MB zip file.\n\nAs it turns out, a lot!\n\nI spent over 50 hours on it and burned more than $120 in tokens to iterate on the UI. And since “code is cheap” nowadays, you can just go grab [the code](https://github.com/alexewerlof/linkedout) and do whatever you want with it (MIT license).\n\nBut I figured not everyone is tech-savvy or has the time or token budget to convert a zip into a full-blown website, so I published it as [a Chrome Extension that extracts the data in your own storage](https://chromewebstore.google.com/detail/linkedout/dcbkghofdjbpmffcieifebmppeiedckc?authuser=0&hl=en-GB) (IndexedDB). No data is gathered or transmitted. The whole thing is just an offline SPA (single-page application).\n\n## How to get your data?\n\nLet’s step back and see how I got this information in the first place. Under EU regulations (GDPR), LinkedIn is legally obliged to give you a copy of your data:\n\n[Article 20] Right to data portability: The data subject shall have the right to receive the personal data concerning him or her, which he or she has provided to a controller, in a structured, commonly used and machine-readable format…\n\nCalifornia (CCPA/CPRA) and Virginia (VCDPA), along with newer laws enacted through 2025/2026, all include a nearly identical “Right to Data Portability” clause as part of their broader consumer access rights.\n\nBrazil has LGPD, Canada has CPPA, India has DPDP… you get the picture.\n\nIf you’re remotely interested in the topic, you can just [go to LinkedIn and initiate the full data download](https://www.linkedin.com/help/linkedin/answer/a1339364/downloading-your-account-data). 
It’s free but takes up to a day. Remember to request a **full download**:\n\nAs we’ll see, the data I got from LinkedIn is machine-readable but it has many problems.\n\n## Data problems\n\nThe .zip file contains many .csv files (and even some .html, if you’ve written articles on LinkedIn) but I encountered many problems, small and big:\n\nIt takes up to 24 hours to get your data (mine is a 4MB zip file and took around 13 hours) and you have to act quickly because the link stops working after 72 hours.\n\nIt breaks the CSV format, making it tricky to parse with conventional libraries (see below for an example).\n\nIt’s inconsistent. For example, a date can be in any of these formats:\n\n`27 May 2026`\n\n`5/27/26, 1:56 AM`\n\n`2026-05-27 19:23:01 UTC`\n\netc.\n\nIt has duplicates. Lots of them:\n\nIt is missing critical information like:\n\nYour own media (pictures, presentations, videos) that are uploaded in posts. You can manually download them from LinkedIn [here](https://www.linkedin.com/mypreferences/d/rich-media).\n\nList of followers\n\n**Links** to [Events you participated in](https://www.linkedin.com/mynetwork/network-manager/events/), Ads you clicked (although AI could guess the right link), [Groups](https://www.linkedin.com/groups/) you’re a member of, [Newsletters you’re subscribed to](https://www.linkedin.com/mynetwork/network-manager/newsletters/), companies you’ve worked for, universities you’ve attended, etc.\n\nI’m actually less worried about what LinkedIn gives me than about what it holds back, because clearly there’s more data on me that’s not in the zip file (see the missing contacts below).\n\nSome data is cut off before a certain date (e.g. login info older than 2 years is missing, and connection requests only go back 6 months). I don’t know whether LinkedIn doesn’t store that data or just doesn’t export it.\n\nDo you remember those 3.8k private contacts that LinkedIn somehow fetched? They’re gone from my latest data export!!! 
🤔 I wonder if it’s just an honest mistake, or whether LinkedIn deleted that data (I didn’t) or just chose not to share it with users anymore. Makes me wonder what else is there that I don’t get.\n\nI’ve been sitting on various versions of this data going as far back as 2 years, and things have changed. For example, LinkedIn recently started suffixing some files (not all of them) with an account number of sorts (e.g. `Reactions_9893749.csv` instead of `Reactions.csv`), which makes it harder for me to find and render the file.\n\nIf you’re into data science or AI, you know that data cleanup is part of the job, but I’m baffled that a Microsoft company doesn’t have the resources to create a solid data download. Incentives! Incentives, baby! 😄\n\nMy understanding is that these pieces of data come from different sources but apparently there’s no Staff+ Engineer across teams to ensure data consistency.\n\nWhat bothers me isn’t the data they’ve shared but what they **don’t** share. As I mentioned, there are some known unknowns (stuff that I know they have, like the images on my posts) and there are some unknown unknowns (like: where did I use LinkedIn to log in to a job application site, or what data does LinkedIn share with other Microsoft companies?).\n\n# Screenshots\n\nI was determined to create an “offline LinkedIn” as much as possible, so there was a lot of trial and error and iteration involved to get to this:\n\n## Tech Stack\n\nReact + TypeScript + ESLint\n\nTailwind + Daisy UI + Lucide icons\n\nVite + Vite-test\n\nI used a mix of GPT 5.4, Gemma 4 12B and DeepSeek V4 Pro to iterate through it.\n\n## Bonus: The AI setup\n\nGitHub’s June price hike severely hurt my flow because I pay for everything out of my own pocket for this kind of hobby project.\n\nI’ve previously shared how I used LM Studio to run local models.\n\nHowever, as I continued experimenting, I landed on a setup that works much better:\n\nI purchased an API key directly 
from [https://www.deepseek.com/en/](https://www.deepseek.com/en/).\n\nSince DeepSeek doesn’t have vision capabilities, I ended up using an open source VS Code extension called [vizards.deepseek-v4-for-copilot](https://github.com/Vizards/deepseek-v4-for-copilot), which mitigates that by handing images to another model.\n\nI run Gemma 4 12B for its excellent vision capabilities on a local Mac mini M4 16GB.\n\nHere’s how it works:\n\nInstead of adding a custom endpoint in VS Code (because it doesn’t support DeepSeek natively), I installed that extension, which takes the DeepSeek token and provides the LLM service to the VS Code Copilot harness using the [Language Model Chat Provider API](https://code.visualstudio.com/api/extension-guides/ai/language-model-chat-provider).\n\nThe extension doesn’t tell VS Code that DeepSeek doesn’t have vision capabilities. Instead, if you attach a snapshot (which is an absolute must for this kind of UI-heavy application), it calls another model to analyze the picture. The extension is not aware of llama.cpp, so you have to add your model using [VS Code’s custom endpoint feature](https://code.visualstudio.com/blogs/2026/06/18/byok-vscode) in advance.\n\nThe reason I went with Gemma 4 12B is its native vision capabilities and the fact that it fits perfectly into a cheap Mac mini M4 16GB I have lying around. 
You can use any other model or even cloud models.\n\nTo sum up:\n\nCloud AI for coding: DeepSeek for its great performance and reasoning\n\nLocal AI for vision: Gemma 4 12B due to its native vision capabilities and low cost (practically free, although I have to pay for the electricity on that efficient Mac mini, which uses up to 60W under load).\n\nEarlier I mentioned that I burned $120 in tokens, but that’s primarily because:\n\nI started this project a few months back using Copilot Pro ($10/mo), then upgraded to Pro Plus ($40/mo).\n\nAfter the price hike I switched to OpenRouter (pay as you go) and tried different models before landing on DeepSeek. The price dropped a bit but not as much as I had hoped, because OpenRouter sent my DeepSeek requests to a bunch of different providers that were much more expensive than DeepSeek’s list price.\n\nEventually I landed on using DeepSeek directly, thanks to a comment on my post, and then added the open source Gemma 4 12B on top of it. So I bit the bullet and gave my credit card info to a Chinese site (I created a temporary virtual card of course, but it has my name on it). Trade-off accepted!\n\nGoing forward, it’ll be much cheaper to develop. I put $10 in my DeepSeek account and still have some left over:\n\n⚠️ Be aware that DeepSeek may also log your requests for training. I developed this app using my personal data, and to prevent leakage I did the data anonymization part entirely with local models (which aren’t that smart, so it took quite some trial and error and manual scripts).\n\nSpecifically, I tasked my local LLM to go through my actual LinkedIn .zip file and create an anonymized [test-data.zip](https://github.com/alexewerlof/linkedout/blob/master/test-data.zip) representing the quirks of the original LinkedIn export data.\n\nThat is not to say the American LLM providers are saints. 
I’m just trying to raise awareness that when working with cloud models you should be extremely cautious about what you share because once the data is out, you have very little control over its fate.\n\nRegardless, if you want to help me get some of that money back, I’d appreciate a paid sub. 🙏 I’m an indie developer spending my personal time and private money to make this.\n\nAlso, I’d appreciate it if you could [share this on Hacker News](https://news.ycombinator.com/submitlink?u=https%3A%2F%2Fblog.alexewerlof.com%2Fp%2Flinkedout&t=LinkedOut) or discuss it there.\n\n# Conclusion\n\nThis was an interesting experiment. I was primarily solving an actual issue I have:\n\nHow do I find my old posts? Because LinkedIn search is useless.\n\nBut then I got curious to see how much of LinkedIn I could reconstruct completely offline. This is the kind of use case that LLM coding enables and I’m happy with the results. Since there are no servers and the app is completely boxed inside Chrome, I’m not too concerned with the quality of the code behind the app. But if it were a paid service I’d definitely spend 10x more time on it and go through every diff to make sure I can stand behind it.\n\nSpeaking of which, I believe that now that code is cheap, what SaaS companies really sell is [SLA](https://blog.alexewerlof.com/p/sla)s, which are essentially guarantees backed by legal contracts. That’s the difference between buying a DIY kit to renovate your bathroom and bringing in the expensive professionals.\n\nMy take after all of this is:\n\nLinkedIn (like any other social media) gathers a lot of data on you. It’s one thing to know, another thing to actually see! 
This app helps you see (part of it at least).\n\nYou can only hope that they don’t feed it to some advanced algorithm to know you better than you do (hint: my messages contain information I trust the other party to read, but obviously LinkedIn as the middleman has access to all of that information).\n\nLinkedIn doesn’t share everything, which makes them legally accountable, but I’m not a lawyer, nor do I have time to sue them.\n\n[My monetization strategy](https://blog.alexewerlof.com/p/faq#%C2%A7payment) is to give away most content for free but these posts take anywhere from a few hours to a few days to draft, edit, research, illustrate, and publish. I pull these hours from my private time, vacation days and weekends. The simplest way to support this work is to **like**, **subscribe** and **share** it. If you really want to support me lifting our community, you can consider a paid subscription. If you want to save, you can get 20% off via [this link](https://blog.alexewerlof.com/protipsdiscount). As a token of appreciation, subscribers get full access to the Pro-Tips sections and my online book [Reliability Engineering Mindset](https://blog.alexewerlof.com/p/rem). Your contribution also funds my open-source products like [Service Level Calculator](https://slc.alexewerlof.com/). You can also [invite your friends](https://blog.alexewerlof.com/leaderboard) to gain free access or save via a [group subscription](https://blog.alexewerlof.com/subscribe?group=true).\n\n*And to those of you who already support me, thank you for sponsoring this content for the others. 
🙌 If you have questions or feedback, or you want me to dig deeper into something, please let me know in the comments.*", "url": "https://wpnews.pro/news/linkedout-see-how-much-data-linkedin-has-on-you", "canonical_source": "https://blog.alexewerlof.com/p/linkedout", "published_at": "2026-06-29 08:18:49+00:00", "updated_at": "2026-06-29 08:28:45.851692+00:00", "lang": "en", "topics": ["ai-tools", "developer-tools", "ai-ethics"], "entities": ["LinkedIn", "Alex Ewerlöf", "LinkedOut", "Chrome", "GDPR", "CCPA", "LGPD"], "alternates": {"html": "https://wpnews.pro/news/linkedout-see-how-much-data-linkedin-has-on-you", "markdown": "https://wpnews.pro/news/linkedout-see-how-much-data-linkedin-has-on-you.md", "text": "https://wpnews.pro/news/linkedout-see-how-much-data-linkedin-has-on-you.txt", "jsonld": "https://wpnews.pro/news/linkedout-see-how-much-data-linkedin-has-on-you.jsonld"}}