element_extract

Module to extract data from the Materials Project database.

Functions

`convexhull`(data)	Filters the compounds close to the convex hull.
`create_folder`(parent_folder)	Creates a 'download' folder; parent_folder is the path to the current working directory.
`create_input`()	Reads 'download.csv' file inside 'download' folder and creates 'input.in' and 'mpid-list.in' files for further downloading and calculations.
`data_combine`(data1, data2)	Combines two pandas DataFrames into a single DataFrame.
`data_one_element_compound`(elm, ntype, ...)	Extracts information for compounds containing only one element.
`data_two_element_compound`(el1, el2, ntype, ...)	Extracts information for compounds containing two elements.
`download`(elm, num_el, exclude_el, properties)	Extracts various properties for compounds that satisfy certain criteria from the Materials Project database.
`download_by_entry`(entries, must_include[, ...])	Extracts data and creates input files using the `mp_api.client.MPRester.get_entries_in_chemsys` function of the Materials Project API package (pip install mp_api).
`extract`(ntype, properties, elm, exclude_el)	Function to extract the data and apply filters, then write 'download.csv' file inside 'download' folder.
`main`()	Main function to orchestrate the data extraction and input file creation process.
`metal_filter`(data)	Filters metallic compounds from the input DataFrame.
`remove`(data, element_list)	Removes compounds containing specified elements from the DataFrame.
`stable`(data)	Filters the compounds for those having negative formation energy.

element_extract.create_folder(parent_folder)

Creates a 'download' folder.

Parameters:

parent_folder (path): Path to the current working directory.
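A minimal sketch of the folder creation described above. It assumes the 'download' folder is placed directly inside parent_folder; the helper name `make_download_folder` is hypothetical and is not part of element_extract.

```python
import tempfile
from pathlib import Path


def make_download_folder(parent_folder):
    """Create '<parent_folder>/download' if it is missing and return its path."""
    target = Path(parent_folder) / "download"
    # exist_ok=True makes repeated calls harmless.
    target.mkdir(parents=True, exist_ok=True)
    return target


# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    folder = make_download_folder(tmp)
    created = folder.is_dir()
    folder_name = folder.name
```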

element_extract.download(elm, num_el, exclude_el, properties)

Extracts various properties for compounds that satisfy certain criteria from the Materials Project database.

Parameters:

elm (str or list of str): Element(s) that must always be present in the compounds. For example, for hydrogen, elm = 'H'. To require several elements, provide a list of at most two, for example elm = ['B', 'C'] for boron and carbon.
num_el (int): Number of elements in the compound.
exclude_el (list of str): List of elements to exclude from the compound.
properties (list of str): List of properties to extract.

Returns:

data (pandas DataFrame): DataFrame containing the extracted data.

Example:

>>> download('H', 2, ['O', 'F'], ['material_id', 'formation_energy_per_atom'])

element_extract.stable(data)

Filters the compounds for those having negative formation energy.

Parameters:

data (pandas DataFrame): DataFrame containing information about compounds, including formation energy per atom.

Returns:

data (pandas DataFrame): DataFrame containing compounds with negative formation energy per atom.
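The filter can be pictured as a boolean selection on the formation-energy column. The sketch below is illustrative only: the helper name, the column name `formation_energy_per_atom` (taken from the download example), and the sample rows are assumptions, not the module's actual code or real database entries.

```python
import pandas as pd


def keep_negative_formation_energy(data, column="formation_energy_per_atom"):
    """Return only the rows whose formation energy per atom is below zero."""
    return data[data[column] < 0].reset_index(drop=True)


# Made-up sample rows; the IDs and energies are placeholders.
sample = pd.DataFrame(
    {
        "material_id": ["mp-1", "mp-2", "mp-3"],
        "formation_energy_per_atom": [-0.52, 0.10, -0.03],
    }
)
filtered = keep_negative_formation_energy(sample)
```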

element_extract.convexhull(data)

Filters the compounds close to the convex hull.

Parameters:

data (pandas DataFrame): DataFrame containing information about compounds, including energy above hull.

Returns:

data (pandas DataFrame): DataFrame containing compounds close to the convex hull.
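The documentation does not say how close to the hull a compound must be, so the sketch below takes the cutoff as an explicit argument. The helper name, the column name `energy_above_hull`, the default `tolerance` of 0.05, and the sample rows are all assumptions made for illustration.

```python
import pandas as pd


def keep_near_hull(data, tolerance=0.05, column="energy_above_hull"):
    """Return rows whose energy above hull does not exceed `tolerance`."""
    # The tolerance value is an arbitrary illustration, not the module's cutoff.
    return data[data[column] <= tolerance].reset_index(drop=True)


# Made-up sample rows.
sample = pd.DataFrame(
    {
        "material_id": ["mp-1", "mp-2", "mp-3"],
        "energy_above_hull": [0.0, 0.02, 0.30],
    }
)
near_hull = keep_near_hull(sample)
on_hull = keep_near_hull(sample, tolerance=0.0)
```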

element_extract.metal_filter(data)

Filters metallic compounds from the input DataFrame.

Parameters:

data (pandas DataFrame): DataFrame containing information about compounds, including band_gap.

Returns:

data (pandas DataFrame): DataFrame containing metallic compounds (band gap <= 0.00001).
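A short sketch of the band-gap criterion stated above. The threshold 0.00001 and the column name `band_gap` come from this page; the helper name and the sample rows are hypothetical.

```python
import pandas as pd

# Threshold quoted in the Returns description above.
METAL_GAP_THRESHOLD = 0.00001


def keep_metals(data, column="band_gap"):
    """Return rows whose band gap is at or below the metallic threshold."""
    return data[data[column] <= METAL_GAP_THRESHOLD].reset_index(drop=True)


# Made-up sample rows.
sample = pd.DataFrame(
    {
        "material_id": ["mp-1", "mp-2", "mp-3"],
        "band_gap": [0.0, 1.2, 0.000005],
    }
)
metals = keep_metals(sample)
```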

element_extract.data_combine(data1, data2)

Combines two pandas DataFrames into a single DataFrame.

Parameters:

data1 (pandas DataFrame): The first DataFrame to be combined.
data2 (pandas DataFrame): The second DataFrame to be combined.

Returns:

data (pandas DataFrame): Combined DataFrame containing data from both data1 and data2.
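One plausible way to combine two result tables is row-wise concatenation with `pandas.concat`. Whether the module concatenates, merges, or drops duplicates is not stated here, so treat the following as an assumption; the helper name and sample rows are hypothetical.

```python
import pandas as pd


def combine_rows(data1, data2):
    """Stack two DataFrames row-wise and renumber the index."""
    return pd.concat([data1, data2], ignore_index=True)


# Made-up sample tables with the same columns.
first = pd.DataFrame({"material_id": ["mp-1", "mp-2"], "band_gap": [0.0, 1.2]})
second = pd.DataFrame({"material_id": ["mp-3"], "band_gap": [0.4]})
combined = combine_rows(first, second)
```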

element_extract.remove(data, element_list)

Removes compounds containing specified elements from the DataFrame.

Parameters:

data (pandas DataFrame): The DataFrame containing compounds to be filtered.
element_list (str, optional): File with elements to exclude. Default is 'remove.list'. The file should contain elements separated by commas. For example, to remove oxygen and nitrogen, write 'O,N' in 'remove.list'.

Returns:

data (pandas DataFrame): Processed DataFrame with compounds containing specified elements removed.
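The sketch below illustrates the comma-separated 'remove.list' format and an element-aware match, so that excluding 'N' does not also drop compounds containing 'Nb'. The helper names, the formula column `formula_pretty`, and the sample formulas are assumptions; the module may identify elements differently.

```python
import re
import tempfile
from pathlib import Path

import pandas as pd


def read_element_list(path):
    """Parse a file such as 'O,N' into a set of element symbols."""
    text = Path(path).read_text()
    return {token.strip() for token in text.split(",") if token.strip()}


def drop_compounds_with(data, excluded, column="formula_pretty"):
    """Drop rows whose formula contains any excluded element symbol."""

    def contains_excluded(formula):
        # Split the formula into symbols: a capital letter plus an optional lowercase one.
        symbols = set(re.findall(r"[A-Z][a-z]?", formula))
        return bool(symbols & excluded)

    mask = data[column].apply(contains_excluded)
    return data[~mask].reset_index(drop=True)


# Write a throwaway 'remove.list' holding oxygen and nitrogen.
with tempfile.TemporaryDirectory() as tmp:
    list_file = Path(tmp) / "remove.list"
    list_file.write_text("O,N\n")
    excluded = read_element_list(list_file)

sample = pd.DataFrame({"formula_pretty": ["MgB2", "MgO", "BN", "NbC"]})
kept = drop_compounds_with(sample, excluded)
```

Matching on parsed symbols rather than substrings is what keeps 'NbC' in the result while 'BN' is removed.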

element_extract.data_one_element_compound(elm, ntype, exclude_el, properties)

Extracts information for compounds containing only one element.

Parameters:

elm (str): Element to search for in compounds. For example, 'B' for boron.
ntype (int or tuple): Number of unique elements. Can be a single integer or a tuple (e.g., (1, 2) for 2 different types).
exclude_el (list): List of elements to exclude. For example, ['O', 'N'].
properties (list): List of properties to extract.

Returns:

data (pandas DataFrame): DataFrame containing information for compounds with only one element.

element_extract.data_two_element_compound(el1, el2, ntype, exclude_el, properties)

Extracts information for compounds containing two elements.

Parameters:

el1 (str): First element to search for in compounds (e.g., 'B' for boron).
el2 (str): Second element to search for in compounds (e.g., 'C' for carbon).
ntype (int or tuple): Number of unique elements in compounds. Can be a single integer or a tuple (e.g., (1, 2) for 2 different types).
exclude_el (list): List of elements to exclude. For example, ['O', 'N'].
properties (list): List of properties to extract.

Returns:

data (pandas DataFrame): DataFrame containing information for compounds with two elements.

element_extract.create_input()

Reads the 'download.csv' file inside the 'download' folder and creates the 'input.in' and 'mpid-list.in' files for further downloading and calculations.

element_extract.extract(ntype, properties, elm, exclude_el, nelm=1, metal=False, neg_fe=False, thermo_stable=False, ordering='NM', nsites=10, spacegroup=None, out='download/download.csv')

Extracts the data, applies the filters, and writes the 'download.csv' file inside the 'download' folder.

Parameters:

ntype (tuple): Number of unique elements in the compound. For example, (1, 3) for compounds with 3 unique elements.
properties (list): List of properties to extract.
elm (list): List of elements used in the search.
exclude_el (list): List of elements to exclude.
nelm (int, optional): Length of the list elm. Default is 1.
metal (bool, optional): True to download zero-band-gap compounds. Default is False.
neg_fe (bool, optional): True to download compounds with negative formation energy. Default is False.
thermo_stable (bool, optional): True to download compounds on the convex hull. Default is False.
ordering (str, optional): Magnetic ordering of the compound. Default is 'NM'.
nsites (int, optional): Maximum number of sites in the compound. Default is 10.
spacegroup (int or str, optional): Space group number or name. Default is None.
out (str, optional): Output file to write. Default is 'download/download.csv'.

Returns:

data (pandas DataFrame): Extracted data after applying filters.
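To show how the boolean flags could narrow a result table, here is a sketch that builds one combined mask. It is not the module's implementation: the helper name, the column names, the inclusive reading of nsites, and the zero cutoff for thermo_stable are assumptions, and the ordering and spacegroup options are left out.

```python
import pandas as pd


def apply_flags(data, metal=False, neg_fe=False, thermo_stable=False, nsites=10):
    """Combine the optional criteria into one boolean mask and apply it."""
    # Assumption: nsites is an inclusive upper bound on the number of sites.
    mask = data["nsites"] <= nsites
    if metal:
        mask = mask & (data["band_gap"] <= 0.00001)
    if neg_fe:
        mask = mask & (data["formation_energy_per_atom"] < 0)
    if thermo_stable:
        # Assumption: "on the convex hull" means zero energy above hull.
        mask = mask & (data["energy_above_hull"] <= 0.0)
    return data[mask].reset_index(drop=True)


# Made-up sample rows.
sample = pd.DataFrame(
    {
        "material_id": ["mp-1", "mp-2", "mp-3", "mp-4"],
        "nsites": [4, 12, 6, 2],
        "band_gap": [0.0, 0.0, 1.1, 0.0],
        "formation_energy_per_atom": [-0.5, -0.4, -0.2, 0.1],
        "energy_above_hull": [0.0, 0.0, 0.0, 0.2],
    }
)
defaults_only = apply_flags(sample)
metal_and_neg_fe = apply_flags(sample, metal=True, neg_fe=True)
```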

element_extract.download_by_entry(entries, must_include, size_constraint=20, ntype_constraint=5, FE=False, thermo_stable=True, metal=False, magnetic=False, spacegroup=None, properties=None)

Extracts data and creates input files using the `mp_api.client.MPRester.get_entries_in_chemsys` function of the Materials Project API package (pip install mp_api). This mode is turned on by setting 'mode': 'chemsys' in the 'download.py' file.

Parameters:

entries (list): List of elements defining the search; both the elements and their compounds (combinations of those elements) are searched.
size_constraint (int, optional): Size of the compounds (total number of ions). Upper bound not included. Default is 20.
ntype_constraint (int, optional): Number of different types of ions. Upper bound not included. Default is 5.
must_include (list): Elements that must be included in the compounds.
FE (bool, optional): True to select compounds with negative formation energy. Default is False.
metal (bool, optional): True to select metallic compounds. Default is False.
magnetic (bool, optional): True to select compounds with a non-zero magnetic moment. Default is False.
spacegroup (int or str, optional): Space group number or name. Default is None.
properties (list, optional): List of properties to extract.

Returns:

None
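The exclusive upper bounds on size_constraint and ntype_constraint can be illustrated with plain dictionaries that map element symbols to ion counts. The helper name and the sample compositions are hypothetical, and the sketch assumes every element in must_include has to be present.

```python
def passes_constraints(composition, must_include, size_constraint=20, ntype_constraint=5):
    """Check a composition given as {element symbol: ion count}."""
    total_ions = sum(composition.values())
    n_types = len(composition)
    return (
        total_ions < size_constraint  # upper bound not included
        and n_types < ntype_constraint  # upper bound not included
        and set(must_include) <= set(composition)  # assumption: all must be present
    )


# Made-up compositions for illustration only.
small_boride = {"Mg": 1, "B": 2}    # 3 ions, contains B
large_boride = {"Mg": 4, "B": 16}   # 20 ions, exactly at the excluded bound
carbide = {"Mg": 2, "C": 2}         # no B

results = [
    passes_constraints(small_boride, ["B"]),
    passes_constraints(large_boride, ["B"]),
    passes_constraints(carbide, ["B"]),
]
```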

element_extract.main()

Main function to orchestrate the data extraction and input file creation process.

If the 'mpid-list.in' file does not exist, the function reads settings from 'config.json' to create it.

Parameters:

None

Returns:

None
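The existence check described for main() can be sketched as follows. Only the file names 'mpid-list.in' and 'config.json' come from this page; the helper name and the `example_key` setting are placeholders, since the real structure of 'config.json' is not documented here.

```python
import json
import tempfile
from pathlib import Path


def load_settings_if_needed(workdir):
    """Return the parsed config.json, or None when mpid-list.in already exists."""
    workdir = Path(workdir)
    if (workdir / "mpid-list.in").exists():
        return None  # nothing to do: the ID list is already present
    with open(workdir / "config.json") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder settings; the key is hypothetical.
    (Path(tmp) / "config.json").write_text(json.dumps({"example_key": "example_value"}))
    settings_before = load_settings_if_needed(tmp)

    # Once the ID list exists, the settings are no longer read.
    (Path(tmp) / "mpid-list.in").write_text("")
    settings_after = load_settings_if_needed(tmp)
```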