All About OSINT: Things That Matter in OSINT
How to start gathering information in OSINT: things you should know and understand

An OSINT search has several stages. Things to know and do:
Know your search object. Define the variables for it, e.g. name, company, and the information you want to find.
Run multiple data searches, for example by dorking or scraping. You can also use data brokers to fill in gaps. Then collect your data, prepare some comparative data, and save the evidence you already have.
Check the information: is the object you found the right one? Check other variables, such as close friends or other information connected to your target, and look and observe.
Validate. Check the data you found against, for example, satellite data, existing datasets, and sensor data, and make sure the 5W + 1H elements are present so you capture the essence of the data object and can tell it apart from others. Observe carefully to make sure there are no discrepancies and no data manipulation (bias).
If you cannot find the information at the search stage, go back to the beginning of the sequence. You may have found unique data early in the search, such as a phone number, email, username, user ID, region or country, height, and so on. If so, search and narrow down the data based on that unique data, using several data brokers or scraping searches. Then parse your data again, look at the similarities and differences, and analyze and observe again.
Once you have finished analyzing, archiving is mandatory; it is very important as evidence of the data you collected. You can write your own archiving program or use a third-party service.
After that, write a report. Make sure it has an abstract, context, the data you found, and your analysis: what did you analyze? What do you have? What do you know? What evidence did you get? End with the conclusion of your analysis.
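For the archiving step above, here is a minimal sketch of what "create your own program to archive" could look like. It is only an illustration under my own assumptions: the function name `make_evidence_record` and the manifest fields are hypothetical, not a standard. The idea is to store a SHA-256 hash and a UTC capture time next to every saved item, so you can later show that the evidence has not changed.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_evidence_record(content: bytes, source_url: str) -> dict:
    """Build a manifest entry for one captured item (hypothetical field names)."""
    return {
        "source_url": source_url,
        # The hash lets you prove later that the stored bytes were not altered.
        "sha256": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        # Always record capture time in UTC to avoid time zone confusion.
        "captured_at_utc": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    page = b"<html><body>example capture</body></html>"
    record = make_evidence_record(page, "https://example.com/post/1")
    print(json.dumps(record, indent=2))
```

In practice you would save the raw bytes to disk as well and append each record to a manifest file; a third-party archive can be used alongside this, not instead of it.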
As an OSINT investigator or practitioner, you will run into things that make searching difficult, for example constraints of language, geography, and platforms; sock accounts that must be prepared with telephone numbers, addresses, and e-KYC or KYC from the relevant country; a country's internet rules; search regulations in each search engine in a particular country; and applicable laws. From what I know and from my experience, here are the points that you should know
- Legit Check
As an OSINT practitioner or investigator, the most important thing is the accuracy and authenticity of the data. I often find news items or news portals with clickbait or hyperbolic headlines; when you open and read them, there are no important points, no answers, and no in-depth conclusion. As OSINT analysts, we are required to read news and look for data from many sources to compare them, extract the important points such as 5W + 1H, take the existing evidence, verify the data to ensure accuracy, store the evidence, and conduct data mapping and sentiment analysis to strengthen the data we collect and present.
- Propaganda
Be careful with propaganda and hoaxes. Many posts, especially on social media such as Facebook and TikTok, contain false narratives, deepfakes, and opinion inflation. Ordinary people who do not know better are easily trapped into accepting this information. Our role as OSINT analysts is to know what is a hoax, what is deceptive propaganda, and what is a deepfake. This matters most for political news.
- Deepfake
Be careful with deepfake content. The rise of AI and ML technology is very useful and plays an important role, but it is also abused to create deepfakes. In Indonesia, a lot of deepfake content is used to fool the public: memes, hoax news, propaganda, buzzers, and hacking activities to cheat the system. This is dangerous, especially if you like posting photos or videos that show your face openly and in detail, because they can become source material for deepfake videos, so always be careful. A deepfake can be identified by looking closely at details such as skin wrinkles, texture, color, animation, teeth, and sound. A bad deepfake is usually still stiff in movement and sound, but if the maker is a professional with many data samples, the result will be better, almost like the original, and hard to distinguish. You can still do manual analysis and use online tools to spot the differences, then take the narrative and dig deeper into whether it really happened or is just a trick.
- Timezone
If you are into geospatial analysis or time analysis and measurement, you must be familiar with time zones. Time zones can be used as a search handle or parameter. Many platforms, such as Facebook and YouTube, use different time zones in their metadata, so you need to work out which area or region a time zone such as UTC+7 points to. You can parse the time to check the time zone, and look at shadows, the sun, and weather conditions to determine the region, the geolocation, and the local time of a country. Knowing the time zone is necessary to judge the accuracy of data. There are also Unix time and other time formats, so you need to do some parsing before presenting your data.
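The time parsing mentioned above can be sketched with Python's standard library. This is a minimal example, assuming you already know (or want to test) a fixed UTC offset such as UTC+7; the helper names `unix_to_local` and `iso_to_utc` are my own and only illustrative.

```python
from datetime import datetime, timedelta, timezone


def unix_to_local(ts: int, utc_offset_hours: float) -> datetime:
    """Convert a Unix timestamp (seconds) to a datetime at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(ts, tz=tz)


def iso_to_utc(text: str) -> datetime:
    """Parse an ISO 8601 string that carries an explicit offset and normalize it to UTC.

    Note: a string without an offset would be treated as local machine time,
    so only pass values that include one.
    """
    return datetime.fromisoformat(text).astimezone(timezone.utc)


if __name__ == "__main__":
    # The same instant, shown in UTC+7 (e.g. western Indonesia) and in UTC.
    print(unix_to_local(1700000000, 7).isoformat())  # 2023-11-15T05:13:20+07:00
    print(unix_to_local(1700000000, 0).isoformat())  # 2023-11-14T22:13:20+00:00
    print(iso_to_utc("2024-01-01T07:00:00+07:00").isoformat())  # 2024-01-01T00:00:00+00:00
```

Normalizing everything to UTC first, and converting to a local offset only when presenting, makes it easier to compare timestamps collected from different platforms.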
- Geospatial
This is similar to the time zone point, but there are things I will explain further. Geospatial analysis is a crucial part of determining data and its accuracy. There are many regions, countries, and islands on Earth, and satellite data and street view are limited when you want to see the latest data, check data, or investigate further. I sometimes find it difficult to determine the geolocation of an area, for example by recognizing buildings, road markings, and other things that can serve as search parameters, and even after finding a location point there are limitations such as missing street view coverage, limited satellite data, and a general lack of data to search with. Nowadays you can use AI and ML to build building-matching tools or similar aids for geolocation. You can also use satellite data, such as NASA's, to find possible locations of events such as climate anomalies, fires, air pollution, or congestion, and use that as a lead for further geolocation. Some countries or places also have rules that keep street view out or censor satellite imagery, which makes geolocation harder, but there are tools and sensors you can use, such as heat sensors, weather sensors, water, oil, and gas sensors, satellite data provider platforms, and others. You can also use programming-language libraries for remote sensing analysis, for example of building damage or disasters. For further knowledge, you can read "Automatic Detection of Damaged Buildings after Earthquake Hazard by Using Remote Sensing and Information Technologies" in international papers or journals. Geospatial cases also involve culture and customs, clothing, biodiversity, and the social environment of each country, so investigators need to understand the social and cultural context of each region or area they analyze.
- Mental and Psychology
Maintaining your mental health is important. As humans, we have limits, such as the limit of our patience, a lowest point, and loss of energy while conducting analysis. Frequently analyzing disturbing content drains the mind, especially pornographic content, sadistic murder scenes, and disturbing sounds such as explosions and shouting, which are not suitable for frequent viewing. However, as investigators, we sometimes have to look at content like that to analyze it further. To protect your mental health, my suggestion is to cut down on disturbing content, take a rest, and do other things such as hobbies or sports to rest your body and control your emotions. If you have the money or work in an institution, there are usually mental or psychological checks to keep workers fit and healthy, especially in jobs such as OSINT analyst or intelligence work. Everyone has their own way of controlling emotions and knowing their mental limits, so I can't explain this in more detail, but I suggest consulting a psychiatrist to stabilize your mental state, or taking a rest.
- BIAS
As humans, we all have bias; I explained this in a previous wiki entry. Bias exists in every human being, and there are many types, such as:
Cognitive Biases
Anchoring Bias
Occurs when someone relies too heavily on the first piece of information they receive when making decisions.
Availability Heuristic
Happens when people estimate the likelihood of an event based on how easily examples come to mind.
Confirmation Bias
The tendency to search for, interpret, and remember information that confirms one's preexisting beliefs.
Framing Effect
People's decisions are influenced by how information is presented, rather than just the facts themselves.
Bandwagon Effect
Occurs when someone adopts a belief or behavior because many others are doing the same, even if they're uncertain.
Barnum Effect
The tendency to believe vague or general statements are personally meaningful and accurate, even though they apply to many people.
Hindsight Bias
The belief, after an event has occurred, that one could have predicted or expected the outcome all along.
Selection Biases
Sampling Bias
Happens when the selected sample does not accurately represent the overall population.
Non-Response Bias
Occurs when certain individuals do not respond to a survey, potentially skewing the results.
Voluntary Response Bias
Arises when participation is self-selected, often leading to overrepresentation of strong opinions.
Self-Selection Bias
Occurs when individuals choose to participate in a study on their own, possibly leading to biased results.
Exclusion Bias
Happens when researchers deliberately exclude certain individuals or data, affecting the study's accuracy.
Other Biases
Reporting Bias
Occurs when researchers selectively disclose or omit information, often favoring desired outcomes.
Courtesy Bias
Happens when participants give socially acceptable or polite responses rather than truthful ones.
Environmental Bias
Arises when the physical or social research environment affects participants' behavior or responses.
Recall Bias
Occurs when participants' memories influence their responses, often inaccurately.
Publication Bias
The tendency for studies with positive or significant results to be published more frequently than those with negative or inconclusive outcomes.
Cultural Bias
Happens when cultural values or norms influence how research is conducted or interpreted.
Observer Bias
Occurs when a researcher's expectations or beliefs influence their observations or interpretation of data.
How to deal with BIAS?
How to Avoid Thinking Mistakes (Cognitive Biases)
Anchoring Bias
Don't believe the first thing you hear too much. Look at more facts before you decide.
Availability Bias
Just because something is easy to remember doesn't mean it happens a lot. Check the real numbers.
Confirmation Bias
Don't only look for things that say you're right. Try to find things that say you're wrong too.
Framing Effect
Be careful with how things are said. The same thing can sound good or bad depending on the words used.
Bandwagon Effect
Don't follow the crowd just because everyone else is doing it. Think for yourself.
Barnum Effect
If something sounds like it fits everyone, it's probably not special or true just for you.
Hindsight Bias
After something happens, don't say "I knew it!" unless you really did. It's easy to say that later.
How to Avoid Mistakes When Choosing People (Selection Bias)
Sampling Bias
Don't pick just one kind of person. Pick different types so it's fair.
Non-Response Bias
If people don't answer, the results may be wrong. Try to ask again nicely.
Voluntary/Self-Selection Bias
If only people who really want to join answer, they might not be like everyone else. That's a problem.
Exclusion Bias
Don't leave some people out on purpose. Everyone should have a fair chance to be picked.
How to Avoid Other Research Mistakes (Other Biases)
Reporting Bias
Don't hide bad results. Show everything you find, good and bad.
Courtesy Bias
Sometimes people say nice things to be polite. Let them know it's okay to be honest.
Environmental Bias
The place where you ask questions can change answers. Choose a calm, quiet spot.
Recall Bias
People forget things. Try to ask questions soon after something happens.
Publication Bias
Don't only share good stories. Share all the results, even if they're boring.
Cultural Bias
People from different cultures think differently. Be respectful and open.
Observer Bias
If you expect something, you might see it even if it's not there. Be careful and ask others to check too.
- Laws
I often mention laws in my presentations and in this wiki. Every country has regulations and rules that must be obeyed; for the details of the rules in your country, you can read them yourself on the internet. These rules need to be understood, and they can make searching for data difficult. For example, searching for information about people in China is quite difficult because regulations apply to each search engine, platform, and so on, which complicates the investigation. As another example, each country sets limits on radio frequencies (Hz to GHz, VHF, UHF), and the relevant authorities can monitor them; if you are careless, you can be caught because of the frequency you use. Here are some basic examples:
Legal and Regulatory Limits: Every country has its own laws and regulations, whether about the internet, press policies, or other areas. That's why ethics and privacy must always be respected.
Potential for Misuse: There is always a risk that information, tools, or data could be misused. This is something to be aware of and to handle responsibly.
Internet Ethics Rules: You probably know there are certain ethics when using the internet. Just because information is public doesn't mean it can be used freely; we should show respect, ask for permission when needed, and ensure our actions have a clear and legal purpose.
Integrity: If you want to be a journalist, you must uphold integrity and trust. This includes being clear about how you find your data, how you analyze it, and being ready to take responsibility if any issues arise from your reporting.
Note: "This is different from homeless journalism. If you want to be a professional journalist, it's best to be open, such as sharing your real name, what media outlet you represent, your legal status, and the protections you have under the law. I explained this more in the video linked earlier."
You can look up more about what homeless journalism is, and how it differs from legal and transparent journalism that follows ethical and legal standards. You can read more here: "[OSINT] Permasalahan etika dan privasi serta resiko yang ada" (you can translate it into English).
- Risk
Risk exists in any condition; every job has risks, and so does OSINT. OSINT is very vulnerable to risks such as OPSEC leaks, leaks of investigator identities, data misuse, and even death threats. There are many factors that I can't explain in detail, but you can see examples in real-world cases, such as leaks of intelligence information, leaks of identities, and the carelessness of the investigators themselves. Here are some examples:
As investigators, we face the risk of data leaks or of creating a fingerprint during investigations, for example through our IP address, user agent, device name, email, phone number, and more. This depends on the situation, which is why OPSEC (Operational Security) is so important.
When it comes to MASINT (Measurement and Signature Intelligence), there are also risks. MASINT has similar risks, but sometimes we still need to go to the physical location to verify the data, gather physical evidence, and ensure accuracy. MASINT can also be combined with HUMINT (Human Intelligence), which means the role of informants and investigators must be handled carefully, and confidentiality must be maintained. Again, it all goes back to understanding OPSEC to prevent our digital fingerprints from being exposed.
So in this context, especially when collecting sensitive information, we must try to minimize our fingerprint as much as possible. Even though MASINT uses technology, direct fieldwork is still sometimes necessary to verify and cross-check data accuracy, and this also carries its own risks.
There is also the insider threat, so be careful. You can read the post "Kenapa OSINT beresiko?" (you can translate it into English XD).
- Captcha and Rate Limiting (WAF)
Many sites and APIs are protected by a WAF (Web Application Firewall), which makes it difficult to collect data from a target site that we want to scrape or build automation tools for. I often encounter CAPTCHAs and rate limiting when scraping. There are CAPTCHA solver tools; they are paid, but quite powerful. For example, if you want to build a tool that checks whether an email is registered on social media, you must know the request headers and request body the site uses in order to automate it, and you will still hit obstacles such as rate limiting and IP blocking. For someone in IT security, this is a specter that must be understood and worked around. Existing tools such as browser use or similar tools will sometimes be limited too; many available tools use third-party APIs, threading, IP rotation, and regex to automate scraping. If you have the budget, I suggest buying a paid tool such as Maltego, because it is already automated, is maintained, has rich data and a list of sites to scrape, reduces the risk of leaking your fingerprint when searching, and avoids loggers from libraries infected with malware.
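To illustrate the rate limiting obstacle described above, here is a small sketch of how a polite collection script could decide how long to wait before retrying. It does not bypass anything; it is plain exponential backoff plus respect for a numeric `Retry-After` header on HTTP 429/503 responses. The function names and default values are my own assumptions.

```python
import random
from typing import List, Optional


def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0,
                   jitter: float = 0.0) -> List[float]:
    """Exponential backoff schedule: base * 2**attempt, capped, plus optional jitter."""
    delays = []
    for attempt in range(retries):
        delay = min(cap, base * (2 ** attempt))
        # Jitter spreads retries out so parallel workers do not hit the site together.
        delays.append(delay + random.uniform(0, jitter))
    return delays


def wait_for(status: int, retry_after: Optional[str], fallback: float) -> float:
    """Seconds to pause after a response; honors a numeric Retry-After on 429/503."""
    if status not in (429, 503):
        return 0.0
    if retry_after and retry_after.strip().isdigit():
        return float(retry_after.strip())
    return fallback


if __name__ == "__main__":
    print(backoff_delays(5))          # [1.0, 2.0, 4.0, 8.0, 16.0]
    print(wait_for(429, "30", 5.0))   # 30.0
```

A scraper would call `wait_for` after each response and sleep for the returned number of seconds, falling back to the next value in the backoff schedule when the server gives no hint.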
- Confidential
As an OSINT investigator, you need to know the CIA Triad. This concept is very important for implementing your OPSEC and for reducing leakage of your data during investigations. You can read the definition of the "CIA Triad" on the internet for further understanding.
- Transaction

Transactions are important in collecting information, especially when analyzing the dark web or black markets (cryptocurrency). As an investigator, you need to secure the identity of your address or wallet so that it does not leak, and decide which currency to use. Securing cryptocurrency transactions is very important, especially today: most crypto exchanges or crypto addresses must be registered with e-KYC or KYC, and crypto transactions can be analyzed publicly, so never link your address to your real account, as this is dangerous. Take threat actors (TA) as an example. They already know OPSEC and secure all their transactions, for example through Bitcoin or crypto laundering, exchanging coins through exchangers without KYC, or exchanging directly through intermediaries such as brokers or person to person. Such transactions complicate the search when no fingerprint is found, so crypto transactions are difficult to investigate; it can succeed when there is human error.
- Combat Tools (OPSEC)
Tools are a must-have, because they speed up information searching and analysis and reduce your risk in conducting investigations. Penetration testing (red team) tools, such as those for intercepting REST API requests, frequency signals, and sensors, can help you analyze existing data and speed up and simplify information searches. Don't forget to set up your OSINT lab, such as cloud VMs or VMs you create yourself for sandboxing, and sockpuppet accounts that look as authentic as possible without involving your own data. Prepare your combat tools before you investigate. Remember that tools and OPSEC are important for your security, especially on the internet. Keep your fingerprint from leaking, and always be careful of dangerous documents such as macro files, exe-based PDFs, and unknown zero-days; use sandboxing and the available tools to protect your identity.
- Language
There are many languages in this world, such as Chinese, Indonesian, Japanese, and others. I personally sometimes find it difficult to identify a language, because there are so many languages, slang terms, swear words, and unfamiliar emojis, as well as each country's typing style and spoken accents, for example Indian, Chinese, and Indonesian accents. I have used AI models, but they did not help much and were often misleading, especially with Chinese, slang, and typing styles. Many OSINT investigators have great difficulty understanding sentences or the context of the language they are studying, so someone who understands the language or lives in the area is needed to be sure of the meaning and the situation. If you use an online translator or AI, believe me, it is still difficult to understand. My suggestion is to learn other languages to strengthen your own knowledge, or to have your own translator for each area. Maybe AI will be better and more advanced in the future; if an AI agent appears that can translate perfectly with good datasets and training data, it will help investigators understand languages.
- Platform ToS (Terms of Service)

Every country has its own platforms. In Indonesia many people use Facebook or other social media, while China uses different platforms such as QQ, Weibo, WeChat, Douyin, and others. You need to know which platforms are commonly used in each country. Each platform also has its own rules, such as registering with a real phone number from that country, an address, and the KYC that applies there. This makes it difficult for investigators to dive into the platform. Always remember that each country has its own internet rules, so you need to know and keep researching them. The article "SOCMINT for China" may help you with this.
- Knowing Your Area
Understand your search area; this is a blind spot for OSINT practitioners. If you do not know the area you are going to explore, searching is very difficult, because the search process requires you to narrow down the data. Without that knowledge, the information returned will be very broad, even from paid tools. So what should you do? As in the previous point, you need to know the region, the culture, the social media commonly used, and the rules that exist in each country. This is very important, so always read and research to increase your knowledge. Don't forget to study your target as well, for example a threat actor (TA). People like this usually already know OPSEC and have adequate skills in how they communicate, the tools they use, and their own security, such as networks or encrypted requests. This is a problem for investigators.
- Blind Spot
Know your blind spots; for example, you may be weak in languages, knowledge of weapons (SALW), or physics. Blind spots are a bugbear for investigators. It is therefore necessary to work in an organized team, because each person has different blind spots, and to keep adding new knowledge, both basic and advanced.
- Espionage and Wiretapping or Human Error
Many investigators and OSINT practitioners are exposed to wiretapping, espionage, or human error when conducting searches. For example, they fall for social engineering or forget about OPSEC, which later becomes a leak in their strategy. Prepare antivirus (AV) and sandboxing for further protection, especially when downloading attachments from unknown sources. Also be careful of unknown zero-day attacks. Many of these unpublished security holes are traded like a business, as you will know if you are familiar with the Zerodium platform, and such attacks are very dangerous because they go undetected. Be careful with scrapers too: many bots and scraper accounts are deployed to collect information such as chats, media, phone numbers, and documents from your group, channel, website, and social media.
- User or Account Privacy Settings
Many account owners already understand privacy and have configured the privacy settings of their social media accounts or smartphones for security, for example locked accounts, age restrictions, temporary emails, or per-post settings that block comments and public sharing. This makes investigation more difficult, because investigators must make extra effort to obtain the information. For the same reason, set your own accounts and smartphone to private for security on the internet; many tutorials are available, or you can find the options in your account or device settings.
- Format Data
When collecting information, investigators will encounter strange data formats such as unusual fonts, emojis, or Unicode variants. This is a challenge in scraping: if you focus on keywords, your scraper will probably miss the topic, and some data may be encrypted. You can use techniques such as XPath and CSS selectors to analyze this. One more thing: every country has different typing conventions, such as slang or abbreviated words, and your scraping target may mix emoji, Unicode, and strange formats. I often find this on forums and social media sites.
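One way to handle the strange fonts mentioned above is Unicode normalization before keyword matching. The sketch below uses Python's standard `unicodedata` module; the helper name `normalize_text` is my own, and this is only a starting point: it will not decode encrypted data or translate slang.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Fold stylized Unicode into plain, case-insensitive text for keyword matching."""
    # NFKC maps compatibility characters (fullwidth, math bold, etc.) to plain ones.
    folded = unicodedata.normalize("NFKC", text)
    # Drop format characters (category Cf), such as zero-width spaces, which are
    # sometimes inserted to break keyword filters. Caveat: this also removes the
    # zero-width joiner used inside some emoji sequences.
    cleaned = "".join(ch for ch in folded if unicodedata.category(ch) != "Cf")
    return cleaned.casefold()


if __name__ == "__main__":
    fullwidth = "\uff2f\uff33\uff29\uff2e\uff34"                       # fullwidth "OSINT"
    bold = "\U0001D407\U0001D41E\U0001D425\U0001D425\U0001D428"        # math bold "Hello"
    hidden = "pass\u200bword"                                          # zero-width space inside
    print(normalize_text(fullwidth))  # osint
    print(normalize_text(bold))       # hello
    print(normalize_text(hidden))     # password
```

Run the normalization on both the scraped text and your keyword list, so that a keyword typed in a fancy font still matches.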
- Lack of Data and Validate Data
Another challenge in collecting information is getting too little or too much data. Sorting and narrowing the data is important, for example by the target's name, region, or age. If you get too little data, try to dig further; there are still many sources you can draw on, such as paid OSINT platforms, HUMINT or CSINT, or searching and scraping the data yourself.
Investigators need to understand data verification: sorting data, collecting the scenarios that occurred, and analyzing the sources that have been found. For example, if a name is recorded as going to school in A while the location is in B, try to find other connecting data, such as social media or close friends' posts, to use as clues in verifying the data. I have discussed verification techniques in this wiki, so please read them; they are related to 5W + 1H.
- Data Censorship
Sometimes the data you find is censored: text, faces, satellite imagery, and so on. There are various types of censorship, such as censorship in the web UI or app, mosaics, and standard methods like the brush tool on iOS. I have tried several techniques to remove censorship (unblur), but mosaics are very difficult. I have tried using tools from this repository, ChatGPT, and other editing methods, but the results were unsatisfactory. However, some censorship can still be seen through, for example when it is not thick enough and leaves gaps; the brush censor on iOS can be bypassed with manual editing (depending on the censor) if the user was careless. For satellite censorship, you can look at other satellite sources such as Maxar, Bing, Planet, NASA, Sentinel, and others; another provider's imagery will usually let you get around the censorship. If you know more about how to unblur censored data, let me know!
- Service or System Changes
In OSINT (Open Source Intelligence) practice, changes to endpoints, algorithms, APIs (such as end-to-end requests), and HTTP requests from various platforms, especially social media and search engines, pose significant challenges. Even minor changes, such as altered URL structures, stricter API rules, or discontinued services (dead services), can render the OSINT tools you've developed ineffective. Additionally, major platforms frequently update their data access rules and policies, both technically and legally. For example, changes to search algorithms can disrupt scraping logic, or API quota restrictions can cause your system to be quickly blocked. As a result, you must spend additional time re-learning the systems and behaviors of each platform, and re-adjusting your tools to remain relevant and functional.
- Insider
Insiders are another source of information. There are special rooms for exchanging information with insiders, whether paid, rented, or disguised. For example, an insider leaks secrets about the lighting in city A, and the enemy uses this information to prepare for battle. Insiders usually infiltrate and blend into organizations or communities, so they are difficult to detect; the more tactically sophisticated they are, the harder it is to identify them.
- CSINT
Closed intelligence sources, or CSINT. Civilians and journalists cannot view this data directly; they can only cite what is released at press conferences or in publications, because the data is classified and only the government or specific entities with authorized access may obtain it. Examples include health information (PHI), bank information, information from informants, legal information, and others. This data is likely to contain sensitive information. It can be used by law enforcement agencies, governments, or international entities for further investigations, clue searches, real-time data, and narrowing down searches. Access to this data is therefore not available to just anyone.
- Grey Literature
Grey literature is a type of information produced outside traditional publishing and distribution channels, such as scientific journals or books. Examples include research reports, conference papers, theses, dissertations, government documents, and other similar materials. Grey literature is often not indexed in general bibliographic databases and may be difficult to find or access through normal publication channels
- Unique Data
Unique data refers to data that is permanent or has distinguishing characteristics that cannot be duplicated. Examples include identification numbers, email addresses, mobile phone numbers, usernames, IDs, transaction IDs, fingerprints, genes, iris patterns, IMSI, IMEI, bank account numbers, or other similar information. This unique data can be used to narrow down your search results to prevent them from spreading too widely. Imagine you're searching for a name: there are many similar names in the world, and while some names are unique, they may not be the right ones. To speed up your search, use unique data as a search guide to make it faster and more targeted. For example, from an email address, you can obtain a social media username, along with phone number and location information. You can search for that data based on those clues; for example, if balabala@gmail.com is registered on Instagram, there will be location, friends, and phone number information. You can search faster by category, such as name or phone number in that area.
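As a small illustration of using unique data as a search guide, the sketch below pulls email addresses and international-format phone numbers out of collected text and de-duplicates them, so each one can be used as a pivot for the next search. The regular expressions are deliberately simple assumptions of mine (phone numbers must start with "+") and will miss some real-world formats.

```python
import re
from typing import Dict, List

# Simplified patterns for illustration only; real data needs stricter validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+\d[\d\s-]{7,14}\d")


def extract_identifiers(text: str) -> Dict[str, List[str]]:
    """Return de-duplicated, normalized emails and phone numbers found in text."""
    # Lowercase emails so the same address in different casing counts once.
    emails = sorted({m.lower() for m in EMAIL_RE.findall(text)})
    # Strip spaces and hyphens so differently formatted numbers compare equal.
    phones = sorted({re.sub(r"[\s-]", "", m) for m in PHONE_RE.findall(text)})
    return {"emails": emails, "phones": phones}


if __name__ == "__main__":
    sample = "Contact: Balabala@gmail.com or balabala@gmail.com, phone +62 812-3456-7890."
    print(extract_identifiers(sample))
    # {'emails': ['balabala@gmail.com'], 'phones': ['+6281234567890']}
```

Normalizing before comparing matters here: the same email or number often appears in several spellings across sources, and treating them as one identifier keeps the search narrow.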
- Knowing the Context
Know the context. If you collect information without knowing its context or basis, searching becomes difficult. For example, you may not know the topography, such as the slopes of Mount Rinjani in Indonesia, which are very steep and deep, with soft or gray soil and extreme weather, or you may not know the context of the story or the sources in an area. The result is that you reason without data or knowledge, which leads to opinion steering, propaganda, and missed basic information. So establish the context as the basis of your information, for example by looking for satellite data, seasonal data, sensor data, and measurements, and by searching for further information.
- Unclear Information (ambiguous)
This is similar to the previous point about "knowing the context," but here I will focus on conflicting information. Many netizens comment without fully understanding the information, relying on hearsay and fake news (hoaxes), such as claims that a shaman can find missing people using method A, B, or C, or can multiply money. Fake news also emerges, for example that the president of Country A is distributing free glasses and spray, which in reality has no effect or is just a scam. Your role here is to use reasoning, logic, and knowledge from fields such as geography, social sciences, religion, and others to verify the information, and to research scientific data and evidence that support your arguments when presenting the information.
- Deleted Data (source)
Deleted data makes it difficult to find information. For example, a news article may have been indexed by Google or other search engines, but its content has since been deleted. Try the cached copy in search engines such as Google, Bing, Yandex, and others, if the engine still offers one. Also check social media; you might find something there. If you can't find it, try the Wayback Machine or other archive sites: enter the domain and search for the article you're looking for. However, archived data isn't guaranteed to exist, as it depends on whether someone has already archived it online, taken a screenshot, used HTTrack, shared it on social media, or reuploaded it.
*Please be patient, I will add more soon
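Checking the Wayback Machine can be automated. The sketch below builds a request URL for the Internet Archive's availability endpoint and reads the "closest" snapshot out of its JSON response. It makes no network call itself; fetching the URL is left to whatever HTTP client you use. The helper names are my own, and the sample response is a shortened illustration of the response shape as I understand it, so treat it as an assumption and check the current API documentation.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def availability_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build a Wayback Machine availability query for a page URL."""
    params = {"url": page_url}
    if timestamp:
        # Optional target date, format YYYYMMDD or YYYYMMDDhhmmss.
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the archived URL from an availability response, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


if __name__ == "__main__":
    print(availability_url("example.com/news?id=1", "20240101"))
    # Illustrative response body, not real data.
    sample = ('{"archived_snapshots": {"closest": {"available": true, '
              '"url": "http://web.archive.org/web/20240101000000/http://example.com/", '
              '"timestamp": "20240101000000", "status": "200"}}}')
    print(closest_snapshot(sample))
```

If `closest_snapshot` returns `None`, nobody has archived that exact URL, and you fall back to the other options above, such as screenshots or reuploads on social media.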