Sources and methodology
Our aim is to help you discover villages in Spain with practical, verifiable information. To do this, we gather data from public sources and, whenever possible, link to the original source.
1) Information sources
Some content may include data from public bodies and official portals, such as:
- INE (Instituto Nacional de Estadística): public statistics and series.
- Town councils and municipal websites/portals (when an official link is available).
- Regional tourism portals (the official portal of each region).
- Ministry of Industry and Tourism and Spain.info (Turespaña), as supporting tourism references.
- Open data (Dataestur / datos.gob.es): reusable official datasets (e.g. heritage, natural spaces, etc.).
- HERE (places/POIs): nearby restaurants, hotels and services for orientation (always attributed and subject to availability).
- Other open public sources that are cited/linked when used.
2) Methodology and automation
We process and structure public information to present it in a more useful way. To do this we use automation and artificial-intelligence tools at several stages:
- Data synthesis: we aggregate and combine information from different public sources to generate coherent fact sheets.
- Technical checks: we use automated scripts to detect broken links, validate data and maintain quality.
All generated content is reviewed and supervised before publication. AI is a tool; it does not replace editorial judgement.
3) Village DNA: how we profile each municipality
Every municipality on this site has a tourism profile built across seven dimensions: romance, gastronomy, nature, history, family-friendliness, instagrammability and logistical difficulty. Each profile is generated by AI models (Gemini) that search open sources for verifiable facts, such as population figures, heritage listings, geographic features and public transport availability, and convert them into structured scores from 0 to 100.
To prevent flat profiles in which every dimension sits near the same value, every profile passes an anti-flat validation: at least one dimension must score 75 or above, and the spread between the highest and lowest scores must be 35 points or more. Profiles that fail are flagged and regenerated. The result is not subjective opinion; it is a structured synthesis of publicly available data.
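The two anti-flat rules can be expressed as a short check. The following is a minimal sketch, not the site's actual code: the function name, the dictionary layout and the dimension keys are assumptions made for illustration; only the 0 to 100 range and the thresholds of 75 and 35 come from the text above.

```python
# Hypothetical key names for the seven dimensions described above.
DIMENSIONS = (
    "romance", "gastronomy", "nature", "history",
    "family_friendliness", "instagrammability", "logistics_difficulty",
)

MIN_PEAK = 75    # at least one dimension must reach this score
MIN_SPREAD = 35  # minimum gap between the highest and lowest score


def passes_anti_flat(profile: dict[str, int]) -> bool:
    """Return True if the profile satisfies both anti-flat rules."""
    missing = set(DIMENSIONS) - profile.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    scores = [profile[name] for name in DIMENSIONS]
    if any(not 0 <= score <= 100 for score in scores):
        raise ValueError("scores must be between 0 and 100")
    return max(scores) >= MIN_PEAK and max(scores) - min(scores) >= MIN_SPREAD


# Made-up example profiles, for illustration only.
flat = dict.fromkeys(DIMENSIONS, 60)
distinct = {**flat, "nature": 85, "logistics_difficulty": 40}
print(passes_anti_flat(flat), passes_anti_flat(distinct))
```

In this sketch a profile that returns False would be the one flagged for regeneration.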
4) Content quality control
Each article goes through automated analysis before publication. A pipeline based on spaCy (natural language processing) runs named-entity recognition to measure the ratio of verifiable facts to generic filler in every text. Articles must meet minimum thresholds, calibrated per language, or they are returned for rewriting.
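One way to picture the threshold step is as an entity-density check. This is a sketch under stated assumptions: it treats the share of tokens that belong to named entities as the measured ratio, it takes the counts as inputs rather than running spaCy itself, and the threshold values are invented placeholders, since the calibrated figures are not given here. With spaCy, the two counts could be obtained as `len(doc)` and `sum(len(ent) for ent in doc.ents)`.

```python
from dataclasses import dataclass

# Placeholder thresholds for illustration; the real calibrated values are not stated.
MIN_ENTITY_DENSITY = {"es": 0.08, "en": 0.10}


@dataclass(frozen=True)
class EntityStats:
    total_tokens: int   # number of tokens in the article
    entity_tokens: int  # tokens that fall inside a recognised named entity


def entity_density(stats: EntityStats) -> float:
    """Share of tokens that belong to named entities (0.0 for empty text)."""
    if stats.total_tokens <= 0:
        return 0.0
    return stats.entity_tokens / stats.total_tokens


def meets_threshold(stats: EntityStats, lang: str) -> bool:
    """Return True if the article reaches the minimum density for its language."""
    return entity_density(stats) >= MIN_ENTITY_DENSITY[lang]


# Made-up counts: 30 entity tokens out of 200 in an English article.
print(meets_threshold(EntityStats(total_tokens=200, entity_tokens=30), "en"))
```

An article for which `meets_threshold` returns False would be the one sent back for rewriting.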
A separate cliché detector flags banned expressions: 60+ patterns for Spanish articles and 98+ for English. Phrases like "hidden gem", "steeped in history" or "a must-visit" are rejected automatically. Once these checks pass, every article receives a final human review. We use AI the way a baker uses a kneading machine: it handles repetitive work, but the baker decides what goes into the oven.
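A cliché detector of this kind can be sketched with a single compiled regular expression. The example below is illustrative only: it contains just the three English phrases quoted above, not the full pattern lists, and the function name is hypothetical.

```python
import re

# Illustrative subset: only the three phrases quoted in the text.
BANNED_PHRASES = ("hidden gem", "steeped in history", "a must-visit")

# One case-insensitive pattern, anchored on word boundaries so that
# a phrase is not matched inside a longer word.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(phrase) for phrase in BANNED_PHRASES) + r")\b",
    re.IGNORECASE,
)


def find_cliches(text: str) -> list[str]:
    """Return every banned phrase found in the text, lowercased, in order."""
    return [match.group(0).lower() for match in _PATTERN.finditer(text)]


print(find_cliches("A hidden gem, truly a must-visit."))
print(find_cliches("The church dates from 1250."))
```

An article would be rejected whenever the returned list is not empty.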
5) Update cycle
Population data comes from INE (Instituto Nacional de Estadística) and is refreshed with each annual release. Images are reviewed and expanded periodically through Flickr API searches combined with AI-assisted selection that filters for relevance and quality.
Articles are updated when significant changes occur, such as a new UNESCO designation, a major infrastructure project, or the addition of a local festival to official calendars. Village DNA profiles are recalculated when new open-data sources become available or when existing sources publish revised datasets.
6) Data cross-verification
We do not rely on a single source. Key data points are checked against multiple references:
- INE for population, demographics and municipal boundaries.
- Official town council websites for heritage sites, local festivals and public services.
- Regional tourism portals (e.g. turismodeasturias.es, andalucia.org) for routes, activities and visitor information.
- Open data repositories (datos.gob.es, Dataestur) for heritage inventories, natural-space catalogues and tourism statistics.
When sources conflict, we prioritise the most recent official publication and note the discrepancy where relevant.
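The conflict rule can be sketched as follows. This is an illustration under assumptions, not the site's implementation: the `SourceValue` record, its fields and the `resolve` function are hypothetical, the figures are made up, and falling back to all candidates when none is official is an added assumption the text does not state.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SourceValue:
    source: str      # e.g. "INE" or a town council website
    value: int       # the reported figure, such as population
    published: date  # publication date of the figure
    official: bool   # whether the source is an official body


def resolve(candidates: list[SourceValue]) -> tuple[SourceValue, bool]:
    """Pick the most recent official value and report whether sources disagree."""
    if not candidates:
        raise ValueError("at least one source value is required")
    # Assumption: if no candidate is official, consider all of them.
    pool = [c for c in candidates if c.official] or candidates
    chosen = max(pool, key=lambda c: c.published)
    has_discrepancy = len({c.value for c in candidates}) > 1
    return chosen, has_discrepancy


# Made-up figures for illustration only.
chosen, has_discrepancy = resolve([
    SourceValue("Town council", 1180, date(2022, 1, 1), official=True),
    SourceValue("INE", 1200, date(2024, 1, 1), official=True),
])
print(chosen.source, chosen.value, has_discrepancy)
```

The second return value is what would trigger a note about the discrepancy.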
7) Important: informational purpose (disclaimer)
The information published on pueblosespanoles.es is provided for informational and educational purposes. Although we strive to keep it up to date and referenced, it may contain errors, be incomplete or become outdated. This site does not represent any public administration.
8) Corrections and contact
If you spot an inaccuracy or an incorrect official link, or wish to suggest an update, please contact us. We review and correct information as quickly as possible. You can also learn more about who maintains this project on the About page.
Note: external links lead to third-party sites and are subject to their own terms and policies.