# Date Author Comment
36339 13/04/2015 01:34 PM Marek Horst

#1257 raising oozie.action.max.output.data to 8192

36291 09/04/2015 07:10 PM Marek Horst

#1257 dropping schema generation related hacks in all map-reduce modules, switching to literal schema parameters

35709 27/03/2015 09:44 AM Marek Horst

#1135 switching icm-iis-parent-container version to 1.0.1-SNAPSHOT in order to include workingDir related changes made in icm-iis-core

35701 27/03/2015 06:18 AM Mateusz Kobos

Removing usage of working_dir from Java workflow node.

35416 17/03/2015 03:04 PM Marek Horst

#1198 aligning IIS dependencies and java code to CDH5.3.0 cluster

35402 17/03/2015 03:01 PM Marek Horst

#1197 introducing job.properties changes aligning paths to rumcajs cluster HDFS structure

35259 11/03/2015 04:53 PM Marek Horst

creating IIS-CDH-5.3.0 branch

35258 11/03/2015 04:52 PM Marek Horst

introducing branches folder

34945 02/03/2015 01:18 PM Marek Horst

updating job.properties

34693 20/02/2015 07:16 PM Marek Horst

#1133 dropping useless working_dir creation for java nodes

34615 19/02/2015 06:12 PM Marek Horst

#1038 introducing ranges in dependencies definition for all IIS modules

33593 16/12/2014 05:15 PM Marek Horst

[maven-release-plugin] prepare for next development iteration

33592 16/12/2014 05:14 PM Marek Horst

[maven-release-plugin] copy for tag icm-iis-ingest-pmc-1.0.0

33591 16/12/2014 05:14 PM Marek Horst

[maven-release-plugin] prepare release icm-iis-ingest-pmc-1.0.0

33590 16/12/2014 05:09 PM Marek Horst

#1044 pre-release switching to released version of parent pom and released dependencies

33413 15/12/2014 12:45 PM Marek Horst

introducing scm definition

33370 12/12/2014 04:19 PM Marek Horst

#1038 changing ceon-scala-commons 0.0.2-SNAPSHOT dependency to released 0.0.2

33367 12/12/2014 03:32 PM Marek Horst

#1038 dependency cleanup: removing obsolete dnet-openaireplus-mapping-utils dependency

33133 02/12/2014 02:54 PM Marek Horst

replacing non-standard dash character with '-'

33131 02/12/2014 12:48 PM Marek Horst

replacing non-standard dash character with '-'

33130 02/12/2014 10:42 AM Marek Horst

fixing test run on jenkins: setting encoding explicitly to utf8

33125 01/12/2014 09:06 PM Marek Horst

#1017 fixing expected citations

33123 01/12/2014 07:40 PM Marek Horst

#1017 fixing PMC and DOI identifiers retrieval from avro map: addressing by Utf8 objects not by String

33104 28/11/2014 06:13 PM Marek Horst

#1017 accepting ExtractedDocumentMetadata instead of DocumentText at PMC citation ingestion input. Aligning integration test and importer workflow.

32942 21/11/2014 05:50 PM Marek Horst

#1017 introducing new PMC metadata ingestion, currently extracting references, journal and pages fields.
Replacing DOM/XPath based citations ingestion with much faster SAX version. Changing pmidtooaid transformer utilizing ExtractedDocumentMetadata instead of parsing XML file. Enabling PMC metadata ingestion in common/import.

32324 07/11/2014 02:57 PM Marek Horst

#955 fixing reference raw text generation for pretty printed NLM documents

32242 05/11/2014 05:32 PM Marek Horst

introducing embedded integration test entry

31234 08/10/2014 07:45 PM Marek Horst

#840 renaming DeduplicationMapping to more generic IdentifierMapping

31225 08/10/2014 06:19 PM Marek Horst

#840 moving IdentifierMapping from importer to common package

31218 08/10/2014 06:12 PM Marek Horst

#840 renaming DeduplicationMapping to more generic IdentifierMapping

31117 06/10/2014 01:20 PM Marek Horst

#757 adding reduce phase for filtering out pmids by article type: the map phase groups PmidMapping objects by pmid and the reduce phase filters out duplicates

31116 06/10/2014 01:18 PM Marek Horst

#757 introducing article type extraction along with unit test. Article type will be required for filtering out pmc duplicates and leaving only proper types

31035 02/10/2014 02:29 PM Marek Horst

introducing cloudera repository in parent container, removing repository definitions from individual IIS modules

31031 02/10/2014 01:44 PM Marek Horst

fixing sourceDocumentId which is now extracted from input DocumentText record conveying NLM

31023 02/10/2014 01:08 PM Marek Horst

#757 fixing pmc citation matching test by providing proper input

31022 02/10/2014 01:08 PM Marek Horst

#757 fixing pmc citation matching test by providing proper input

30987 01/10/2014 06:38 PM Marek Horst

#757 fixing pmid and doi matching, fixing sourceDocumentId and destinationDocumentId generation

30986 01/10/2014 06:37 PM Marek Horst

#757 fixing pmid and doi matching, fixing sourceDocumentId and destinationDocumentId generation

30804 22/09/2014 08:25 AM Michal Oniszczuk

Commented out test in a stub of a solution to the task #576: Ingestion of metadata from EuropePMC.

30802 20/09/2014 02:19 PM Michal Oniszczuk

Stub of a solution to the task #576: Ingestion of metadata from EuropePMC.

30801 20/09/2014 02:18 PM Michal Oniszczuk

Refactored code to use the XPathEvaluator.fromString method.

30418 17/09/2014 11:06 AM Sandro La Bruzzo

created tag folder for release

30145 12/09/2014 03:16 PM Marek Horst

updating default job properties

29631 28/07/2014 09:45 PM Marek Horst

renaming workflow to ingest_pmc_plaintext

29390 21/07/2014 11:54 AM Mateusz Kobos

Excluding conflicting dependency

29097 14/07/2014 03:58 PM Marek Horst

replacing "result" string with Type.result.name()

28990 10/07/2014 04:15 PM Marek Horst

updating job.properties

28973 09/07/2014 05:55 PM mateusz.fedoryszak

dir names in parameters should not contain nameNode

28931 07/07/2014 05:52 PM mateusz.fedoryszak

rename a field

28768 01/07/2014 05:04 PM Marek Horst

introducing deploy.info file for module icm-iis-ingest-pmc