refactor(etls): add all fields for DEVCO ETL - EUBFR-245 #196

Open: wants to merge 10 commits into base: master
48 changes: 44 additions & 4 deletions docs/types/Project.md
@@ -24,6 +24,12 @@ Type: [Budget][4]

### Properties

- `devco_equity` **[BudgetItem][1]?**
- `devco_guarantee` **[BudgetItem][1]?**
- `devco_interest_rate_subsidy` **[BudgetItem][1]?**
- `devco_investment_grant` **[BudgetItem][1]?**
- `devco_loan` **[BudgetItem][1]?**
- `devco_ta` **[BudgetItem][1]?**
- `eu_contrib` **[BudgetItem][1]**
- `funding_area` **[Array][5]<[string][2]>**
- `mmf_heading` **[string][2]**
@@ -160,25 +166,57 @@ Type: [Timeframe][16]
- `to` **([string][2] | null)**
- `to_precision` **[TimePrecision][15]**

## SimpleValueField

Describes a generic field in an ETL which does not have any other specific structure.

Type: [SimpleValueField][17]

### Properties

- `raw` **[string][2]**
- `formatted` **[string][2]**

## TypedValueField

Describes a field which has a certain type of value.

Type: [TypedValueField][18]

### Properties

- `field` **[string][2]**
- `type` **[string][2]**
- `raw` **[string][2]**
- `formatted` **[string][2]**

## Project

Describes `project`.

Type: [Project][17]
Type: [Project][19]

### Properties

- `action` **[string][2]**
- `budget` **[Budget][4]**
- `call_year` **[string][2]**
- `comments` **[string][2]**
- `complete` **[boolean][20]**
- `description` **[string][2]**
- `devco_arei_projects_endorsement` **[SimpleValueField][17]?**
- `devco_cris_number` **[SimpleValueField][17]?**
- `devco_date_entry` **([string][2] | null)?**
- `devco_lead_investor` **[SimpleValueField][17]?**
- `devco_leverage` **[SimpleValueField][17]?**
- `devco_project_stage` **[SimpleValueField][17]?**
- `devco_results_indicators` **[Array][5]<[TypedValueField][18]>?**
- `ec_priorities` **[Array][5]<[string][2]>**
- `media` **[Array][5]<[Media][12]>**
- `programme_name` **[string][2]**
- `project_id` **[string][2]**
- `project_locations` **[Array][5]<[Location][9]>**
- `project_website` **[string][2]**
- `complete` **[boolean][18]**
- `related_links` **[Array][5]<[RelatedLink][13]>**
- `reporting_organisation` **[string][2]**
- `results` **[Result][14]**
@@ -207,5 +245,7 @@ Type: [Project][17]
[14]: #result
[15]: #timeprecision
[16]: #timeframe
[17]: #project
[18]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean
[17]: #simplevaluefield
[18]: #typedvaluefield
[19]: #project
[20]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean
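The `SimpleValueField` and `TypedValueField` types added to `docs/types/Project.md` can be pictured as plain objects. The following is a minimal sketch, not code from this PR: the helper names `makeSimpleValueField` and `makeTypedValueField` are hypothetical, trimming as the formatting step is an assumption, and the `type` value `'indicator'` is a placeholder because the docs do not say which values `type` takes.

```javascript
// Hypothetical helpers, not part of the PR: they only illustrate the documented shapes.

// SimpleValueField: { raw, formatted }
function makeSimpleValueField(value) {
  // Assumption: `raw` keeps the source value as a string, `formatted` is a trimmed copy.
  const raw = value === undefined || value === null ? '' : String(value);
  return { raw, formatted: raw.trim() };
}

// TypedValueField: { field, type, raw, formatted }
function makeTypedValueField(field, type, value) {
  return { field, type, ...makeSimpleValueField(value) };
}

// Made-up sample values.
const stage = makeSimpleValueField(' Pipeline ');
const indicator = makeTypedValueField(
  '1.9 Renewable generation capacity (MW)',
  'indicator', // placeholder type label
  12.5
);
console.log(stage, indicator);
```

Keeping `raw` next to `formatted` lets a consumer display the cleaned value while still being able to inspect what the source file contained.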
167 changes: 162 additions & 5 deletions docs/types/etls/devco-xls.md
@@ -24,22 +24,43 @@ Returns **Project** JSON matching the type fields.

### getBudget

Preprocess `budget`
Preprocess `budget`.

Input fields taken from the `record` are:

- `Total EU Contribution (Million Euro)`
- `Total Budget (Million Euro)`
- `Investment Grant (Million Euro)`
- `TA (Million Euro)`
- `Interest Rate Subsidy (Million Euro)`
- `Guarantee (Million Euro)`
- `Equity (Million Euro)`
- `Budget Support (Million Euro)`
- `Loan (Million Euro)`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **Budget**

### getComments

Preprocess `comments`.

Input fields taken from the `record` are:

- `Comments`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **[String][4]**

### getDescription

Preprocess `description`
Preprocess `description`.

Input fields taken from the `record` are:

@@ -49,6 +70,141 @@ Input fields taken from the `record` are:

- `record` **[Object][3]** The row received from parsed file

Returns **[String][4]**

### getCrisNumber

Preprocess `devco_cris_number`.

Input fields taken from the `record` are:

- `CRIS No or ExCom Des`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **SimpleValueField**

### getInvestor

Preprocess `devco_lead_investor`.

Input fields taken from the `record` are:

- `Lead Financier`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **SimpleValueField**

### getLeverage

Preprocess `devco_leverage`.

Input fields taken from the `record` are:

- `Leverage`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **SimpleValueField**

### getResultsIndicators

Preprocess `devco_results_indicators`.

Input fields taken from the `record` are:

- `1.1 Access on grid electricity ('000 people)`
- `1.2 Access mini grid electricity ('000 people)`
- `1.3 Access off-grid electricity ('000 people)`
- `1.4 Inferred access (additional generation) ('000 people)`
- `1.5 Inferred access (cross-border transmission) ('000 people)`
- `1.6 Access to biomass/biogas clean cooking ('000 people)`
- `1.7 Access to LPG/ethanol cooking ('000 people)`
- `1.8 Electricity from renewables (GWh/year)`
- `1.9 Renewable generation capacity (MW)`
- `1.10 Electricity from energy efficiency (liberated capacity) (MW)`
- `1.11 Transmission lines (km)`
- `1.12 Distribution lines (km)`
- `1.13 Energy Savings (MWh/year)`
- `1.14 GHG emissions avoided per year (ktons CO2eq)`
- `1.15 No of direct jobs person/year (construction)`
- `1.16 No of permanent jobs (operation)`
- `2.1 Direct and Inferred electricity access ('000 people)`
- `2.2 Clean cooking and fuel access ('000 people)`
- `2.3 Direct and Inferred access to energy ('000 people)`
- `2.4 Electricity from renewabes (GWh/year)`
- `2.5 Reneable generation capacity (MW)`
- `2.6 Electricity generation capacity (MW)`
- `2.7 Transmission and distribution lines (km)`
- `2.8 GHG emissions avoided per year (ktons CO2eq)`
- `2.9 No of direct and permanent jons (construction and operation)`
- `BET1 (Access to energy)`
- `BET2 (Renewable energy generation and energy efficiency)`
- `BET3 (Contribution to the fight against climate change)`
- `EURF 1 (No of people provided with access to electricity with EU support)`
- `EURF 2 (Renewable energy production supported by the EU)`
- `EURF 3 (GHG emission avoided)`
- `SDG 7.1.1 Percentage of population with access to electricity)`
- `SDG 7.1.2 (Proportion of population with primary reliance on clean fuels and technology)`
- `SDG 7.2.1 Renewable energy share in the total final energy consumption)`
- `SDG 7.3.1 (Energy intensity measured in terms of primary energy and GDP)`
- `SDG 8.3.1 (Proportion of informal employement in non-agriculture employment, by sex)`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **[Array][5]<TypedValueField>**
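
A preprocessor with this signature could be sketched as below. This is an assumption about the general approach, not the PR's implementation: `getResultsIndicatorsSketch`, `INDICATOR_COLUMNS`, and `PLACEHOLDER_TYPE` are hypothetical names, only three of the documented columns are listed, and skipping empty cells is an assumed behaviour.

```javascript
// Hypothetical sketch, not the PR's implementation.
// Only a few of the documented column names are listed, for brevity.
const INDICATOR_COLUMNS = [
  '1.8 Electricity from renewables (GWh/year)',
  '1.9 Renewable generation capacity (MW)',
  '1.11 Transmission lines (km)',
];

// Placeholder: the docs do not say which values `type` takes.
const PLACEHOLDER_TYPE = 'indicator';

function getResultsIndicatorsSketch(record) {
  return INDICATOR_COLUMNS
    // Assumption: empty or missing cells produce no entry.
    .filter(column => {
      const value = record[column];
      return value !== undefined && value !== null && value !== '';
    })
    // Each remaining column becomes one TypedValueField-shaped object.
    .map(column => {
      const raw = String(record[column]);
      return { field: column, type: PLACEHOLDER_TYPE, raw, formatted: raw.trim() };
    });
}

// Made-up sample row.
const sampleRow = {
  '1.8 Electricity from renewables (GWh/year)': 40,
  '1.11 Transmission lines (km)': '',
};
const indicators = getResultsIndicatorsSketch(sampleRow);
console.log(indicators);
```

Storing the column name in `field` keeps each entry self-describing, which suits a list whose members come from many differently named spreadsheet columns.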

### getProjectStage

Preprocess `devco_project_stage`.

Input fields taken from the `record` are:

- `Project Stage`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **SimpleValueField**

### getDateEntry

Preprocess `devco_date_entry`.

Input fields taken from the `record` are:

- `Date of data entry`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **[Date][6]** The date formatted into an ISO 8601 date format
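
The conversion to ISO 8601 could look roughly like the sketch below. It is not the PR's code: `getDateEntrySketch` is a hypothetical name, and it assumes the parsed cell is either a JavaScript `Date` or a string that `Date` can parse. Returning `null` for missing or unparseable cells is likewise an assumption, chosen to match the `(string | null)` type documented for `devco_date_entry`.

```javascript
// Hypothetical sketch, not the PR's implementation.
function getDateEntrySketch(record) {
  const value = record['Date of data entry'];
  // Treat missing cells as "no date"; new Date(null) would otherwise yield the epoch.
  if (value === undefined || value === null || value === '') return null;
  const date = value instanceof Date ? value : new Date(value);
  // Invalid dates have a NaN timestamp.
  return Number.isNaN(date.getTime()) ? null : date.toISOString();
}

// Made-up sample rows.
const entry = getDateEntrySketch({ 'Date of data entry': new Date(Date.UTC(2018, 0, 15)) });
const missing = getDateEntrySketch({});
console.log(entry, missing);
```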

### getAreiProjectsEndorsement

Preprocess `devco_arei_projects_endorsement`.

Input fields taken from the `record` are:

- `For AREI Projects (Endorsement) Y/N`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **SimpleValueField**

### getCodeByCountry

Gets country code from a country name.
@@ -61,7 +217,7 @@ Returns **[String][4]** The ISO 3166-1 country code

### getLocations

Preprocess `project_locations`
Preprocess `project_locations`.

Input fields taken from the `record` are:

@@ -76,7 +232,7 @@ Returns **[Array][5]**

### getResults

Preprocess `results`
Preprocess `results`.

Input fields taken from the `record` are:

@@ -125,7 +281,7 @@ Returns **Result**

### getType

Preprocess `type`
Preprocess `type`.

#### Parameters

@@ -138,3 +294,4 @@ Returns **[Array][5]** Project types
[3]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Object
[4]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String
[5]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array
[6]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Date
30 changes: 30 additions & 0 deletions resources/elasticsearch/mappings/project.js
@@ -21,13 +21,35 @@ const textWithKeyword = {
fields: { keyword: { type: 'keyword', ignore_above: 256 } },
};

const simpleValueField = {
properties: {
raw: { type: 'text' },
formatted: { type: 'text' },
},
};

const typedValueField = {
properties: {
field: { type: 'keyword' },
type: { type: 'keyword' },
raw: { type: 'text' },
formatted: { type: 'text' },
},
};

module.exports = () => ({
mappings: {
project: {
properties: {
action: { type: 'text' },
budget: {
properties: {
devco_equity: budgetItem,
devco_guarantee: budgetItem,
devco_interest_rate_subsidy: budgetItem,
devco_investment_grant: budgetItem,
devco_loan: budgetItem,
devco_ta: budgetItem,
eu_contrib: budgetItem,
funding_area: textWithKeyword,
mmf_heading: textWithKeyword,
@@ -38,9 +60,17 @@ module.exports = () => ({
},
},
call_year: { type: 'text' },
comments: { type: 'text' },
computed_key: { type: 'keyword' },
created_by: { type: 'keyword' },
description: { type: 'text' },
devco_arei_projects_endorsement: simpleValueField,
devco_cris_number: simpleValueField,
devco_date_entry: { type: 'date' },
devco_lead_investor: simpleValueField,
devco_leverage: simpleValueField,
devco_project_stage: simpleValueField,
devco_results_indicators: typedValueField,
ec_priorities: textWithKeyword,
last_modified: { type: 'date' },
media: {
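Elasticsearch has no dedicated array type, so the object mapping `typedValueField` also covers `devco_results_indicators` when it holds an array of such objects. The sketch below copies the two mapping fragments from the diff and compares them with a made-up document fragment. `unmappedKeys` is a hypothetical helper written for this illustration, not part of the PR or of any Elasticsearch client.

```javascript
// Mapping fragments copied from the diff above.
const simpleValueField = {
  properties: { raw: { type: 'text' }, formatted: { type: 'text' } },
};
const typedValueField = {
  properties: {
    field: { type: 'keyword' },
    type: { type: 'keyword' },
    raw: { type: 'text' },
    formatted: { type: 'text' },
  },
};

// Made-up document fragment for illustration only.
const doc = {
  devco_project_stage: { raw: 'Pipeline', formatted: 'Pipeline' },
  devco_results_indicators: [
    { field: '1.11 Transmission lines (km)', type: 'indicator', raw: '120', formatted: '120' },
  ],
};

// Hypothetical helper: lists keys of an object (or of each array element)
// that the given mapping fragment does not declare.
function unmappedKeys(value, mapping) {
  const known = Object.keys(mapping.properties);
  const items = Array.isArray(value) ? value : [value];
  return items.reduce(
    (extra, item) => extra.concat(Object.keys(item).filter(key => !known.includes(key))),
    []
  );
}

console.log(unmappedKeys(doc.devco_project_stage, simpleValueField));
console.log(unmappedKeys(doc.devco_results_indicators, typedValueField));
```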
1 change: 1 addition & 0 deletions services/ingestion/etl/cordis/csv/src/lib/transform.js
@@ -424,6 +424,7 @@ export default (record: Object): Project | null => {
action: '',
budget: getBudget(record),
call_year: '',
comments: '',
description: getDescription(record),
ec_priorities: [],
media: [],
@@ -35,6 +35,7 @@ Object {
},
},
"call_year": "",
"comments": "",
"complete": true,
"description": "RCN: 30802
Objectives: %LTo understand the mechanisms leading to the formation of nitrous acid on the
@@ -107,6 +108,7 @@ Object {
},
},
"call_year": "",
"comments": "",
"complete": true,
"description": "rcn: 215949
acronym: UNISECO
@@ -523,6 +525,7 @@ Object {
},
},
"call_year": "",
"comments": "",
"complete": true,
"description": "rcn: 14088
objective: As a result of the rapid increase in forest damages in Mid-Europe, the need for the reduction of air pollutions from energy conversion and energy-end-use technologies became an important political objective.",
@@ -633,6 +636,7 @@ Object {
},
},
"call_year": "",
"comments": "",
"complete": true,
"description": "rcn: 215954
acronym: SMARTER