refactor(budgetFormatter): improve work with monetary information - EUBFR-208 (#163)

Reopens #161 and builds upon it.

# PR description

For a list of the scenarios covered and considered, please refer to the test suites in `lib/test/unit/*.spec.js`.

## QA Checklist

When you add a new ETL/producer, please check for the following:

* [ ] Producer's secrets are stored safely
* [ ] Ensure the ETL is added to the corresponding `scripts/` for automated deployment and deletion 
* [ ] There is at least 1 unit test with a jest snapshot for the transform function of the ETL
* [ ] Update the file PRODUCERS_DATA_AVAILABILITY_GRID.md to indicate which fields are available in the ETL's source data
* [ ] Ensure there is (Flow/JSDoc) documentation in the `transform.js` file, which is used for automated documentation
* [ ] Generate the necessary documentation pages for the new ETL by running `yarn docs:md`
kalinchernev authored and yhuard committed Oct 11, 2018
1 parent 5567376 commit 9a0e361
Showing 16 changed files with 3,019 additions and 119 deletions.
69 changes: 28 additions & 41 deletions docs/types/etls/inforegio-xml.md
@@ -9,7 +9,7 @@ Transform function: [implementation details][2]

**Parameters**

- - `record` **[Object][3]** The row received from parsed file
+ - `record` **[Object][3]** The row received from parsed file

Returns **Project** JSON matching the type fields

@@ -19,13 +19,13 @@ Check if field is an array or a string

**Parameters**

- - `data` **([Object][3] \| [string][4])** The input piece of data
+ - `data` **([Object][3] \| [string][4])** The input piece of data

**Examples**

```javascript
- input => ['foo']
- output => 'foo'
+ input => ['foo'];
+ output => 'foo';
```

Returns **[string][4]** The string value of the input data
@@ -36,13 +36,13 @@ Format date

**Parameters**

- - `date` **[Date][5]** Date in `DD/MM/YYYY` format
+ - `date` **[Date][5]** Date in `DD/MM/YYYY` format

**Examples**

```javascript
- input => "02/02/2018"
- output => '2018-02-02T00:00:00.000Z'
+ input => '02/02/2018';
+ output => '2018-02-02T00:00:00.000Z';
```

Returns **[Date][5]** The date formatted into an ISO 8601 date format
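
The conversion documented above can be sketched as follows. This is a minimal illustration only: `formatDateSketch` is a hypothetical name, and the actual implementation in `transform.js` is not part of this diff, so the validation rules here are assumptions.

```javascript
// Hypothetical helper, not the ETL's own code: converts a `DD/MM/YYYY`
// string into an ISO 8601 string, or returns null when the input is invalid.
const formatDateSketch = date => {
  if (typeof date !== 'string') return null;

  const match = date.trim().match(/^(\d{2})\/(\d{2})\/(\d{4})$/);
  if (!match) return null;

  const [, day, month, year] = match;
  const parsed = new Date(
    Date.UTC(Number(year), Number(month) - 1, Number(day))
  );

  // Reject overflowed dates such as 31/02/2018, which Date rolls into March.
  if (parsed.getUTCMonth() !== Number(month) - 1) return null;

  return parsed.toISOString();
};

console.log(formatDateSketch('02/02/2018')); // 2018-02-02T00:00:00.000Z
```

Building the date with `Date.UTC` keeps the result independent of the machine's time zone, which matches the midnight-UTC output shown in the documented example.
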
@@ -53,13 +53,13 @@ Get address from different fields

Input fields taken from the `record` are:

- - `Beneficiary_address`
- - `Beneficiary_Post_Code`
- - `Beneficiary_City`
+ - `Beneficiary_address`
+ - `Beneficiary_Post_Code`
+ - `Beneficiary_City`

**Parameters**

- - `record` **[Object][3]** The row received from parsed file
+ - `record` **[Object][3]** The row received from parsed file

Returns **[string][4]** The address as consumed by {Partner}

@@ -69,14 +69,7 @@ Formats information for the {BudgetItem}

**Parameters**

- - `budget` **[string][4]** Prefixed currency value
-
- **Examples**
-
- ```javascript
- input => "EUR 329 000 000"
- output => "329000000"
- ```
+ - `budget` **[string][4]** Prefixed currency value

Returns **BudgetItem** The formatted budget

@@ -86,17 +79,17 @@ Get funding areas from a string

Input fields taken from the `record` are:

- - `Funds`
+ - `Funds`

**Parameters**

- - `record` **[Object][3]** The row received from parsed file
+ - `record` **[Object][3]** The row received from parsed file

**Examples**

```javascript
- input => 'foo;bar;baz'
- output => ['foo', 'bar', 'baz']
+ input => 'foo;bar;baz';
+ output => ['foo', 'bar', 'baz'];
```

Returns **[Array][6]** List of values for funding area
@@ -107,7 +100,7 @@ Gets NUTS code level from a string

**Parameters**

- - `code` **[String][4]** The NUTS code
+ - `code` **[String][4]** The NUTS code

Returns **[Number][7]** The level of NUTS or null if one can't be extracted
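
One way to derive the level is from the length of the code, since a NUTS code starts with a two-letter country code and each further character adds one level. The sketch below assumes that rule; `getNutsLevelSketch` is a hypothetical name and the ETL's real function is not shown in this diff.

```javascript
// Hypothetical helper, not the ETL's own code: returns the NUTS level (0-3)
// of a code, or null when the string does not look like a NUTS code.
const getNutsLevelSketch = code => {
  if (typeof code !== 'string') return null;

  const normalized = code.trim().toUpperCase();

  // Two letters for the country, then up to three alphanumeric characters.
  if (!/^[A-Z]{2}[A-Z0-9]{0,3}$/.test(normalized)) return null;

  return normalized.length - 2;
};

console.log(getNutsLevelSketch('BE21')); // 2
```
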

@@ -117,13 +110,13 @@ Get a list of {Location}

Input fields taken from the `record` are:

- - `Project_country`
- - `Project_region`
- - `Project_NUTS2_code`
+ - `Project_country`
+ - `Project_region`
+ - `Project_NUTS2_code`

**Parameters**

- - `record` **[Object][3]** The row received from parsed file
+ - `record` **[Object][3]** The row received from parsed file

Returns **[Array][6]** List of {Location}

@@ -133,17 +126,17 @@ Get themes from a string

Input fields taken from the `record` are:

- - `Themes`
+ - `Themes`

**Parameters**

- - `record` **[Object][3]** The row received from parsed file
+ - `record` **[Object][3]** The row received from parsed file

**Examples**

```javascript
- input => 'foo; bar; baz'
- output => ['foo', 'bar', 'baz']
+ input => 'foo; bar; baz';
+ output => ['foo', 'bar', 'baz'];
```

Returns **[Array][6]** List of values for themes
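
The documented input and output suggest a split on `;` followed by trimming. A minimal sketch of that behaviour, assuming empty entries should be dropped (the source does not say), with `splitListSketch` as a hypothetical name:

```javascript
// Hypothetical helper, not the ETL's own code: splits a semicolon-separated
// string into trimmed, non-empty values.
const splitListSketch = value => {
  if (typeof value !== 'string') return [];

  return value
    .split(';')
    .map(item => item.trim())
    .filter(item => item.length > 0);
};

console.log(splitListSketch('foo; bar; baz')); // [ 'foo', 'bar', 'baz' ]
```

The same helper covers the funding-area example (`'foo;bar;baz'`), because trimming is a no-op when there is no padding around the separator.
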
@@ -155,25 +148,19 @@ Depends on getAddress()

Input fields taken from the `record` are:

- - `Beneficiary`
- - `Beneficiary_Country`
+ - `Beneficiary`
+ - `Beneficiary_Country`

**Parameters**

- - `record` **[Object][3]** The row received from harmonized storage
+ - `record` **[Object][3]** The row received from harmonized storage

Returns **[Array][6]** List of a single {Beneficiary}

[1]: https://github.com/ec-europa/eubfr-data-lake/blob/master/services/ingestion/etl/inforegio/mapping.md

[2]: https://github.com/ec-europa/eubfr-data-lake/blob/master/services/ingestion/etl/inforegio/xml/src/lib/transform.js

[3]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Object

[4]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String

[5]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Date

[6]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array

[7]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number
74 changes: 73 additions & 1 deletion lib/budgetFormatter.js
@@ -33,10 +33,82 @@ const fixedRates = {
    FIM: 5.94573,
  };

+ // Ensures a given set of exceptions are handled properly before numeral.
+ export const prepareValue = value => {
+   const getCharCount = (char, inputString) =>
+     (inputString.match(RegExp(char, 'g')) || []).length;
+
+   // Leave integers, floats and other non-strings intact.
+   if (typeof value !== 'string') return value;
+
+   // Return original value if there are no dots or commas in the string.
+   if (!value.includes('.') && !value.includes(',')) return value;
+
+   let formatted = value;
+   let hasAbbreviation = false;
+
+   // Handle abbreviations with caution.
+   const abbreviations = ['k', 'm', 'b', 't'];
+
+   abbreviations.forEach(abb => {
+     if (formatted.includes(abb)) {
+       // Change comma to dot for numeral, that's all.
+       formatted = formatted.replace(/,/g, '.');
+       hasAbbreviation = true;
+     }
+   });
+
+   let dots = getCharCount('\\.', value);
+   let commas = getCharCount('\\,', value);
+
+   // Cleanup cases of multiple dots or commas.
+   // They surely don't denote fractions.
+   if (dots >= 2) {
+     formatted = formatted.replace(/\./g, '');
+   }
+
+   if (commas >= 2) {
+     formatted = formatted.replace(/,/g, '');
+   }
+
+   // Cleanup spacing.
+   if (!hasAbbreviation) {
+     formatted = formatted.trim().replace(/\s/g, '');
+
+     // Get number of dots and commas after cleanup of the multiples.
+     dots = getCharCount('\\.', formatted);
+     commas = getCharCount('\\,', formatted);
+
+     if (dots) {
+       const breakdown = value.split('.');
+       const last = breakdown[breakdown.length - 1];
+       const fraction = last.length;
+       // If the single dot separates something longer than 2 chars, then it's not a float.
+       if (fraction && fraction > 2 && !last.includes(',')) {
+         formatted = formatted.replace(/\./g, '');
+       }
+     } else if (commas) {
+       const breakdown = value.split(',');
+       const last = breakdown[breakdown.length - 1];
+       const fraction = last.length;
+       // Similarly, if the comma separates something longer than 2 chars, it's also not a float
+       if (fraction && fraction > 2 && !last.includes('.')) {
+         formatted = formatted.replace(/,/g, '');
+       }
+       // However, if the fraction is smaller, then the comma still needs to be replaced for numeral.
+       formatted = formatted.replace(/,/g, '.');
+     }
+   }
+
+   return formatted;
+ };
+
  export const sanitizeValue = value => {
    if (!value) return 0;

-   const sanitizedValue = Math.abs(numeral(value).value()) || 0;
+   const formatted = prepareValue(value);
+
+   const sanitizedValue = Math.abs(numeral(formatted).value()) || 0;

    // Prevent long values
    if (sanitizedValue > 9223372036854775807) return 0;
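
The core idea behind `prepareValue` is a separator heuristic: a dot or comma that repeats, or that is followed by more than two characters, is treated as a thousands separator, while a single one followed by at most two characters marks the fraction. The sketch below is a simplified, standalone variant of that idea and not the commit's code: `normalizeSeparators` is a hypothetical name, and it deliberately ignores abbreviations (`k`, `m`, `b`, `t`) and currency prefixes.

```javascript
// Hypothetical, simplified variant of the separator heuristic.
// It is NOT the prepareValue() from this commit.
const normalizeSeparators = input => {
  // Leave numbers and other non-strings intact.
  if (typeof input !== 'string') return input;

  const compact = input.replace(/\s/g, '');
  const separators = compact.match(/[.,]/g) || [];
  if (separators.length === 0) return compact;

  const lastIndex = Math.max(
    compact.lastIndexOf('.'),
    compact.lastIndexOf(',')
  );
  const lastSeparator = compact[lastIndex];
  const tail = compact.slice(lastIndex + 1);
  const repeated = separators.filter(s => s === lastSeparator).length > 1;
  const stripSeparators = s => s.replace(/[.,]/g, '');

  // A repeated separator, or one followed by more than two characters,
  // groups thousands, so every separator can be dropped.
  if (repeated || tail.length > 2) return stripSeparators(compact);

  // Otherwise the last separator marks the fraction.
  return `${stripSeparators(compact.slice(0, lastIndex))}.${tail}`;
};

console.log(normalizeSeparators('1.000,50')); // 1000.50
console.log(normalizeSeparators('12.345')); // 12345
```

A string normalized this way uses at most one dot as the decimal mark, which is the shape a parser such as numeral expects in its default locale.
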