element_extract

Module to extract data from the Materials Project database.

Functions

convexhull(data)

Filters the compounds close to the convex hull.

create_folder(parent_folder)

Creates a 'download' folder inside parent_folder, the path to the current working directory.

create_input()

Reads 'download.csv' file inside 'download' folder and creates 'input.in' and 'mpid-list.in' files for further downloading and calculations.

data_combine(data1, data2)

Combines two pandas DataFrames into a single DataFrame.

data_one_element_compound(elm, ntype, ...)

Extracts information for compounds containing only one element.

data_two_element_compound(el1, el2, ntype, ...)

Extracts information for compounds containing two elements.

download(elm, num_el, exclude_el, properties)

Extracts various properties for compounds that satisfy certain criteria from the Materials Project database.

download_by_entry(entries, must_include[, ...])

Extracts data and creates input files using the mp_api.client.MPRester.get_entries_in_chemsys function of the Materials Project API package (pip install mp_api).

extract(ntype, properties, elm, exclude_el)

Function to extract the data and apply filters, then write 'download.csv' file inside 'download' folder.

main()

Main function to orchestrate the data extraction and input file creation process.

metal_filter(data)

Filters metallic compounds from the input DataFrame.

remove(data, element_list)

Removes compounds containing specified elements from the DataFrame.

stable(data)

Filters the compounds for those having negative formation energy.

element_extract.create_folder(parent_folder)

Creates a ‘download’ folder inside parent_folder, the path to the current working directory.
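
As a rough illustration of the behaviour described above, the sketch below creates a 'download' folder under a parent directory. The helper name create_download_folder is hypothetical, and tolerating an already existing folder is an assumption; the actual create_folder may behave differently.

```python
import os
import tempfile


def create_download_folder(parent_folder):
    """Create a 'download' folder inside parent_folder and return its path."""
    path = os.path.join(parent_folder, "download")
    # Assumption: an already existing folder is not treated as an error.
    os.makedirs(path, exist_ok=True)
    return path


# Demonstrate on a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    folder = create_download_folder(tmp)
    folder_exists = os.path.isdir(folder)
```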

element_extract.download(elm, num_el, exclude_el, properties)

Extracts various properties for compounds that satisfy certain criteria from the Materials Project database.

Parameters:

elm : str or list of str

Element(s) that must always be present in the compounds. For example, for hydrogen, elm = ‘H’. To require multiple elements, provide a list of at most two, for example elm = [‘B’, ‘C’] for boron and carbon.

num_el : int

Number of elements in the compound.

exclude_el : list of str

List of elements to exclude from the compound.

properties : list of str

List of properties to extract.

Returns:

data : pandas DataFrame

DataFrame containing the extracted data.

Example:

>>> download('H', 2, ['O', 'F'], ['material_id', 'formation_energy_per_atom'])

element_extract.stable(data)

Filters the compounds for those having negative formation energy.

Parameters:

data : pandas DataFrame

DataFrame containing information about compounds, including formation energy per atom.

Returns:

data : pandas DataFrame

DataFrame containing compounds with negative formation energy per atom.
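
A minimal sketch of this kind of filter, assuming the formation energy is stored in a column named 'formation_energy_per_atom' (the property name used in the download example). The helper name and the toy rows are illustrative only and are not taken from the module.

```python
import pandas as pd


def keep_negative_formation_energy(data):
    """Return only rows whose formation energy per atom is below zero."""
    # Assumed column name: 'formation_energy_per_atom'.
    mask = data["formation_energy_per_atom"] < 0
    return data[mask].reset_index(drop=True)


# Toy data with placeholder identifiers, not real database entries.
sample = pd.DataFrame({
    "material_id": ["toy-1", "toy-2", "toy-3"],
    "formation_energy_per_atom": [-0.42, 0.10, -0.05],
})
filtered = keep_negative_formation_energy(sample)
```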

element_extract.convexhull(data)

Filters the compounds close to the convex hull.

Parameters:

data : pandas DataFrame

DataFrame containing information about compounds, including energy above hull.

Returns:

data : pandas DataFrame

DataFrame containing compounds close to the convex hull.

element_extract.metal_filter(data)

Filters metallic compounds from the input DataFrame.

Parameters:

data : pandas DataFrame

DataFrame containing information about compounds, including band_gap.

Returns:

data : pandas DataFrame

DataFrame containing metallic compounds (band gap <= 0.00001).
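
The band-gap criterion quoted above could be applied roughly as follows. This is a sketch, not the module's code: the helper name keep_metals and the toy rows are hypothetical, and only the 'band_gap' column and the 0.00001 cutoff come from the description.

```python
import pandas as pd

BAND_GAP_CUTOFF = 0.00001  # threshold quoted in the description above


def keep_metals(data):
    """Return only rows whose band gap is at or below the cutoff."""
    mask = data["band_gap"] <= BAND_GAP_CUTOFF
    return data[mask].reset_index(drop=True)


# Toy data with placeholder identifiers, not real database entries.
sample = pd.DataFrame({
    "material_id": ["toy-1", "toy-2", "toy-3"],
    "band_gap": [0.0, 1.2, 0.000001],
})
metals = keep_metals(sample)
```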

element_extract.data_combine(data1, data2)

Combines two pandas DataFrames into a single DataFrame.

Parameters:

data1 : pandas DataFrame

The first DataFrame to be combined.

data2 : pandas DataFrame

The second DataFrame to be combined.

Returns:

data : pandas DataFrame

Combined DataFrame containing data from both data1 and data2.
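
One plausible way to combine two DataFrames row-wise is pandas.concat, sketched below. Whether data_combine uses concat, resets the index, or drops duplicates is not stated here, so each of those choices is an assumption; the helper name combine is hypothetical.

```python
import pandas as pd


def combine(data1, data2):
    """Stack two DataFrames row-wise and renumber the index from zero."""
    # Assumption: rows are simply appended; duplicates are kept.
    return pd.concat([data1, data2], ignore_index=True)


first = pd.DataFrame({"material_id": ["toy-1", "toy-2"]})
second = pd.DataFrame({"material_id": ["toy-3"]})
combined = combine(first, second)
```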

element_extract.remove(data, element_list)

Removes compounds containing specified elements from the DataFrame.

Parameters:

data : pandas DataFrame

The DataFrame containing compounds to be filtered.

element_list : str, optional

File with elements to exclude. Default is ‘remove.list’. The file should contain elements separated by commas. For example, to remove oxygen and nitrogen, write ‘O,N’ in ‘remove.list’.

Returns:

data : pandas DataFrame

Processed DataFrame with compounds containing specified elements removed.
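
The comma-separated file format described above can be parsed and applied along these lines. This is a sketch under assumptions: the helper names, the 'formula_pretty' column name, and the regex-based formula parsing are illustrative and may differ from what remove actually does.

```python
import os
import re
import tempfile

import pandas as pd


def read_element_list(path):
    """Read comma-separated element symbols, e.g. 'O,N', from a file."""
    with open(path) as handle:
        return [s.strip() for s in handle.read().split(",") if s.strip()]


def drop_compounds_with(data, elements, formula_column="formula_pretty"):
    """Drop rows whose formula contains any of the given elements."""
    banned = set(elements)

    def is_clean(formula):
        # Element symbols: one capital letter, optionally one lowercase letter.
        return banned.isdisjoint(re.findall(r"[A-Z][a-z]?", formula))

    return data[data[formula_column].map(is_clean)].reset_index(drop=True)


# Write a temporary 'remove.list' containing 'O,N' and read it back.
with tempfile.TemporaryDirectory() as tmp:
    list_path = os.path.join(tmp, "remove.list")
    with open(list_path, "w") as handle:
        handle.write("O,N\n")
    excluded = read_element_list(list_path)

sample = pd.DataFrame({"formula_pretty": ["MgB2", "MgO", "BN", "NbC"]})
kept = drop_compounds_with(sample, excluded)
```

Matching whole element symbols rather than substrings keeps 'NbC' in the result even though nitrogen ('N') is excluded.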

element_extract.data_one_element_compound(elm, ntype, exclude_el, properties)

Extracts information for compounds containing only one element.

Parameters:

elm : str

Element to search for in compounds. For example, ‘B’ for boron.

ntype : int or tuple

Number of unique elements. Can be a single integer or a tuple (e.g., (1, 2) for 2 different types).

exclude_el : list

List of elements to exclude. For example, [‘O’, ‘N’].

properties : list

List of properties to extract.

Returns:

data : pandas DataFrame

DataFrame containing information for compounds with only one element.

element_extract.data_two_element_compound(el1, el2, ntype, exclude_el, properties)

Extracts information for compounds containing two elements.

Parameters:

el1 : str

First element to search for in compounds (e.g., ‘B’ for boron).

el2 : str

Second element to search for in compounds (e.g., ‘C’ for carbon).

ntype : int or tuple

Number of unique elements in compounds. Can be a single integer or a tuple (e.g., (1, 2) for 2 different types).

exclude_el : list

List of elements to exclude. For example, [‘O’, ‘N’].

properties : list

List of properties to extract.

Returns:

data : pandas DataFrame

DataFrame containing information for compounds with two elements.

element_extract.create_input()

Reads ‘download.csv’ file inside ‘download’ folder and creates ‘input.in’ and ‘mpid-list.in’ files for further downloading and calculations.

element_extract.extract(ntype, properties, elm, exclude_el, nelm=1, metal=False, neg_fe=False, thermo_stable=False, ordering='NM', nsites=10, spacegroup=None, out='download/download.csv')

Extracts the data, applies filters, and writes the ‘download.csv’ file inside the ‘download’ folder.

Parameters:

ntype : tuple

Number of unique elements in the compound. For example: (1, 3) for 3 different unique elements in compounds.

properties : list

List of properties to extract.

elm : list

List of elements used in search.

exclude_el : list

List of elements to exclude.

nelm : int, optional

Length of list elm. Default is 1.

metal : bool, optional

True to download zero bandgap compounds. Default is False.

neg_fe : bool, optional

True to download compounds with negative formation energy. Default is False.

thermo_stable : bool, optional

True to download compounds at the convex hull. Default is False.

ordering : str, optional

Magnetic ordering of the compound. Default is ‘NM’.

nsites : int, optional

Maximum number of sites in the compound. Default is 10.

spacegroup : int or str, optional

Spacegroup number or name. Default is None.

out : str, optional

Output file to write. Default is ‘download/download.csv’.

Returns:

data : pandas DataFrame

Extracted data after applying filters.

element_extract.download_by_entry(entries, must_include, size_constraint=20, ntype_constraint=5, FE=False, thermo_stable=True, metal=False, magnetic=False, spacegroup=None, properties=None)

Extracts data and creates input files using the mp_api.client.MPRester.get_entries_in_chemsys function of the Materials Project API package (pip install mp_api). This mode is enabled by setting ‘mode’: ‘chemsys’ in the ‘download.py’ file.

Parameters:

entries : list

List of elements defining the search; the elements themselves and their compounds (combinations of these elements) are searched.

size_constraint : int, optional

Size of the compounds (total number of ions). Upper bound not included. Default is 20.

ntype_constraint : int, optional

Number of different types of ions. Upper bound not included. Default is 5.

must_include : list

Elements that must be included in the compounds.

FE : bool, optional

True if the formation energy is negative. Default is False.

thermo_stable : bool, optional

True to download compounds at the convex hull. Default is True.

metal : bool, optional

True if the compound is a metal. Default is False.

magnetic : bool, optional

True if the compound has a non-zero magnetic moment. Default is False.

spacegroup : int or str, optional

Spacegroup number or name. Default is None.

properties : list, optional

List of properties to extract.

Returns:

None
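
To make the "upper bound not included" wording concrete, the sketch below applies exclusive size and type limits to compositions given as plain dictionaries. It does not call the Materials Project API. The helper passes_constraints, the toy compositions, and the reading of must_include as "all listed elements must be present" are assumptions.

```python
def passes_constraints(composition, must_include,
                       size_constraint=20, ntype_constraint=5):
    """Check exclusive upper bounds on ion count and number of ion types."""
    total_ions = sum(composition.values())
    return (
        total_ions < size_constraint                # 20 ions is already too many
        and len(composition) < ntype_constraint     # 5 ion types is too many
        and set(must_include) <= set(composition)   # assumed: all must be present
    )


# Toy compositions as element -> count mappings (illustrative only).
candidates = {
    "MgB2": {"Mg": 1, "B": 2},
    "Mg10B10": {"Mg": 10, "B": 10},  # hypothetical cell with exactly 20 ions
    "Mg2": {"Mg": 2},
}
selected = [name for name, comp in candidates.items()
            if passes_constraints(comp, ["B"])]
```

With the default limits, the 20-ion entry is rejected because the bound is exclusive, and the entry without boron is rejected by must_include.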

element_extract.main()

Main function to orchestrate the data extraction and input file creation process.

If the ‘mpid-list.in’ file does not exist, the function reads settings from ‘config.json’ to create it.

Parameters:

None

Returns:

None
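
The check described for main can be pictured as below: settings are read from 'config.json' only when 'mpid-list.in' is absent. This is a sketch under assumptions; the helper name, the workdir argument, and the placeholder key in the demo file are hypothetical, and the real keys expected in 'config.json' are not documented here.

```python
import json
import os
import tempfile


def load_settings_if_needed(workdir):
    """Return settings from config.json, or None if mpid-list.in exists."""
    if os.path.exists(os.path.join(workdir, "mpid-list.in")):
        return None  # the list is already there; nothing to create
    with open(os.path.join(workdir, "config.json")) as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder content; not the module's actual configuration schema.
    with open(os.path.join(tmp, "config.json"), "w") as handle:
        json.dump({"example_key": "example_value"}, handle)
    first = load_settings_if_needed(tmp)   # no mpid-list.in yet

    # Create an empty mpid-list.in and check again.
    open(os.path.join(tmp, "mpid-list.in"), "w").close()
    second = load_settings_if_needed(tmp)
```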