renewablesinfo.org is an open-source intelligence platform for the US power plant fleet. It assembles data from eight federal and open sources into a single searchable, filterable, exportable interface — covering 15,053 plants and 29,172 generators across every fuel type, state, and ownership structure.
This page explains why we built it, what we believe, and where it is going.
Why this exists now
The energy industry is in the middle of two simultaneous transitions. The first is the shift from fossil to renewable generation — a physical transformation of the grid that is already well underway. The second is the shift from human-only data consumption to AI-assisted intelligence — a transformation in how decisions get made about that grid.
Both transitions need the same thing: granular, structured, fresh, plant-level data that machines can read as easily as humans. That data exists — the US government publishes it through the EIA, FERC, DOE, and affiliated programs. But it is scattered across dozens of filing programs, published in incompatible formats, and requires significant assembly work before it becomes useful.
renewablesinfo.org does that assembly work once, openly, so nobody has to do it again.
The gap we are filling
The energy data market is well-served at the extremes. At one end, the government publishes raw data — comprehensive but difficult to use. At the other end, commercial intelligence platforms sell curated datasets for tens of thousands of dollars per year — polished but locked away.
The gap is in the middle: asset-level detail, assembled across sources, structured for analysis, and freely available.
Most open energy datasets give you aggregated statistics — total solar capacity by state, average wind capacity factors by region. Those are fine for high-level reporting. They are useless for the questions that analysts, developers, and AI agents actually need to answer: Which specific plants does NextEra operate in ERCOT? What is the capacity factor of this particular solar farm? Who owns the generators at this hybrid facility, and when were they commissioned?
These are asset-level questions. They require asset-level data. That is what we provide.
Why open source
The raw data we use is public. The EIA publishes it. FERC publishes it. LBNL, USGS, and OpenStreetMap contribute to it. Anyone with enough patience can download the same files and build the same database.
The value is not in the raw data — it is in the assembly. Matching plant records across EIA forms. Resolving ownership through GEM entity chains and Wikidata identifiers. Joining generation histories with engineering specifications. Mapping plants to grid nodes through substation crosswalks. Computing capacity factors, rankings, and coverage statistics. That assembly work is where the intelligence lives.
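To make one of those assembly steps concrete: a capacity factor is actual generation over a period divided by what the plant could have produced running at nameplate rating the whole time. A minimal sketch (the function name and a round-number example are ours, not the platform's schema):

```python
def capacity_factor(generation_mwh: float, nameplate_mw: float, hours: float) -> float:
    """Actual generation divided by maximum possible generation over the period."""
    if nameplate_mw <= 0 or hours <= 0:
        raise ValueError("nameplate capacity and period hours must be positive")
    return generation_mwh / (nameplate_mw * hours)

# A 100 MW solar farm that generated 219,000 MWh over a non-leap year (8,760 h):
cf = capacity_factor(219_000, 100, 8_760)
print(round(cf, 3))  # 0.25
```

The formula itself is trivial; the assembly work is in reliably joining the generation history (one EIA form) to the nameplate rating (another) for the same physical unit.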
We believe that assembly layer should be open. Three reasons:
- Better agentic systems need better data foundations. AI agents that can reason about energy markets — comparing owners, evaluating portfolios, tracking generation trends — need structured, granular, machine-readable data underneath. That data layer is infrastructure, not a product. Infrastructure should be open.
- Open data compounds. When researchers, modelers, journalists, and developers can build on a shared data layer, the ecosystem gets smarter collectively. A financial model built on our data validates it. A research paper that cites it extends its reach. An agent that queries it proves its structure. Every consumer makes the data more valuable for every other consumer.
- The alternative is fragmentation. Without a shared open layer, every organization that needs plant-level data builds its own internal version — duplicating effort, introducing inconsistencies, and creating data silos that cannot interoperate. We have seen this happen across energy analytics. It is wasteful and avoidable.
What we believe
These principles guide every decision — from schema design to UI layout to what we choose to build next.
Asset-level detail matters more than aggregated reports.
A total for California solar capacity is a statistic. The specific plants, their owners, capacities, generation histories, and grid connections — that is intelligence.
Data should be shaped for its consumer.
The same plant data exists in multiple forms: JSONB documents for the web app, a flat index for search, parquet files for the pipeline. Each shape serves a different consumer. Future consumers — AI agents, financial models, researchers — will get their own shapes.
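The "one record, many shapes" idea can be sketched with a toy plant record. Field names and shapes here are illustrative only, not the platform's actual schema:

```python
plant = {
    "plant_id": 56789,
    "name": "Example Solar Farm",
    "state": "TX",
    "generators": [
        {"gen_id": "GEN1", "nameplate_mw": 50.0, "technology": "Solar PV"},
        {"gen_id": "GEN2", "nameplate_mw": 50.0, "technology": "Solar PV"},
    ],
}

# Shape 1: nested document for a plant detail page (serialize as-is to JSON/JSONB).
detail_doc = plant

# Shape 2: one flat row per plant, precomputed aggregates, for a search index.
search_row = {
    "plant_id": plant["plant_id"],
    "name": plant["name"],
    "state": plant["state"],
    "total_mw": sum(g["nameplate_mw"] for g in plant["generators"]),
    "n_generators": len(plant["generators"]),
}

# Shape 3: one row per generator, for columnar/analytical storage (e.g. parquet).
generator_rows = [{"plant_id": plant["plant_id"], **g} for g in plant["generators"]]

print(search_row["total_mw"], len(generator_rows))  # 100.0 2
```

The content is identical in all three; what changes is the access pattern each shape is optimized for.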
Freshness is a feature, not a pipeline problem.
Stale data kills trust faster than missing data. We invested in incremental pipelines and automated refresh not because it was technically interesting, but because a user who returns and finds the same data as last month will stop returning.
Usability is as important as the data itself.
The same data, presented differently, has radically different utility. A searchable, filterable, exportable interface makes 15,053 plants usable. A CSV dump makes them available. Those are not the same thing.
Build for the knowledgeable user.
We design for energy analysts, developers, researchers, and policymakers. They know what nameplate capacity means. They are comfortable with dense data tables. We do not simplify for casual visitors at the cost of precision for professionals.
What the platform covers
Every US power plant registered with the EIA — across all fuel types, all states, all ownership structures. For each plant, we assemble data across multiple dimensions:
- Ownership: utility, operator, parent company, entity chain
- Engineering: generator specs, solar panels, wind turbines
- Generation: monthly MWh, capacity factors, annual trends
- Financial: FERC costs, LBNL solar prices, FERC EQR contracts
- Grid & Pricing: balancing authority, ISO region, nearby nodes
- Context: Wikipedia summaries, news articles, external links
Where this is going
The current platform is a reference tool — you search for a plant, you see its details, you explore and filter across the fleet. That is valuable on its own. But it is the starting point, not the destination.
The data architecture is designed to evolve in three additive layers:
- Now — the web app. Plant detail pages, search, explore with filters, data export. Serving analysts and researchers who need structured access to the full US fleet.
- Next — structured analytics. Normalized database tables for entities, generators, ownership chains, and spatial data. Enabling cross-plant queries, portfolio analytics, and corporate graph traversal.
- Future — the intelligence layer. Knowledge articles linked to plants, semantic search via embeddings, and an agent-facing interface that translates natural language to structured queries. The goal: ask the database a question in English and get a precise, sourced answer.
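One way to picture the translation step in that future agent-facing interface: map a natural-language question onto a structured, parameterized query rather than free-form SQL. A deliberately naive sketch — the pattern matching, field names, and query shape are all assumptions, not the eventual implementation:

```python
import re

def question_to_query(question: str) -> dict:
    """Naive natural-language-to-structured-query translation: pull an
    operator name and a region out of a question shaped like
    'Which plants does <operator> operate in <region>?'."""
    m = re.search(r"does (.+?) operate in (\w+)", question, re.IGNORECASE)
    if not m:
        raise ValueError("unsupported question shape")
    return {
        "entity": "plants",
        "filters": {"operator": m.group(1), "iso_region": m.group(2)},
        "fields": ["plant_id", "name", "nameplate_mw"],
    }

q = question_to_query("Which specific plants does NextEra operate in ERCOT?")
print(q["filters"])  # {'operator': 'NextEra', 'iso_region': 'ERCOT'}
```

A real implementation would use semantic search and an LLM rather than regexes, but the output contract is the point: a constrained query against known tables and fields, so every answer stays precise and sourced.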
Each layer is additive — nothing gets replaced, only extended. The web app continues to serve its users. New capabilities arrive for new audiences without disrupting what already works.