element_extract
Module to extract data from the Materials Project database.
Functions
| convexhull | Keeps only the compounds close to the convex hull. |
| create_folder | Creates a ‘download’ folder. |
| create_input | Reads the ‘download.csv’ file inside the ‘download’ folder and creates the ‘input.in’ and ‘mpid-list.in’ files for further downloading and calculations. |
| data_combine | Combines two pandas DataFrames into a single DataFrame. |
| data_one_element_compound | Extracts information for compounds containing only one element. |
| data_two_element_compound | Extracts information for compounds containing two elements. |
| download | Extracts the requested properties from the Materials Project database for compounds that satisfy the given criteria. |
| download_by_entry | Extracts data and creates input files using the mp_api.client.MPRester.get_entries_in_chemsys function of the Materials Project API package (pip install mp_api). |
| extract | Extracts the data, applies filters, and writes the ‘download.csv’ file inside the ‘download’ folder. |
| | Main function to orchestrate the data extraction and input file creation process. |
| metal_filter | Keeps only the metallic compounds in the input DataFrame. |
| remove | Removes compounds containing specified elements from the DataFrame. |
| stable | Keeps only the compounds with negative formation energy. |
- element_extract.create_folder(parent_folder)
Creates a ‘download’ folder.
Parameters:
- parent_folder
Path to the current working directory.
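The folder creation described above can be sketched as follows. This is a minimal illustration, not the module's code: `make_download_folder` is a hypothetical name, and it assumes the ‘download’ folder is created directly under `parent_folder` and that an existing folder is left untouched.

```python
import os


def make_download_folder(parent_folder):
    """Create a 'download' folder under parent_folder (hypothetical helper)."""
    path = os.path.join(parent_folder, "download")
    # exist_ok=True makes repeated calls harmless if the folder already exists.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling it with `os.getcwd()` would mirror the "path to the current working directory" usage that the parameter description suggests.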
- element_extract.download(elm, num_el, exclude_el, properties)
Extracts the requested properties from the Materials Project database for compounds that satisfy the given criteria.
Parameters:
- elm : str or list of str
Element(s) that must always be present in the compounds. For example, for hydrogen, elm = ‘H’. To require several elements, pass a list of at most two, e.g. elm = [‘B’, ‘C’] for boron and carbon.
- num_el : int
Number of elements in the compound.
- exclude_el : list of str
List of elements to exclude from the compound.
- properties : list of str
List of properties to extract.
Returns:
- data : pandas DataFrame
DataFrame containing the extracted data.
Example:
>>> download('H', 2, ['O', 'F'], ['material_id', 'formation_energy_per_atom'])
- element_extract.stable(data)
Keeps only the compounds with negative formation energy.
Parameters:
- data : pandas DataFrame
DataFrame containing information about compounds, including formation energy per atom.
Returns:
- data : pandas DataFrame
DataFrame containing only the compounds with negative formation energy per atom.
- element_extract.convexhull(data)
Keeps only the compounds close to the convex hull.
Parameters:
- data : pandas DataFrame
DataFrame containing information about compounds, including energy above hull.
Returns:
- data : pandas DataFrame
DataFrame containing only the compounds close to the convex hull.
- element_extract.metal_filter(data)
Keeps only the metallic compounds in the input DataFrame.
Parameters:
- data : pandas DataFrame
DataFrame containing information about compounds, including band_gap.
Returns:
- data : pandas DataFrame
DataFrame containing only the metallic compounds (band gap <= 0.00001).
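A filter of this kind is typically a boolean mask on the band_gap column. The sketch below assumes exactly that; `keep_metals` is a hypothetical name, the material IDs are placeholders, and only the 0.00001 threshold is taken from the description above.

```python
import pandas as pd

BAND_GAP_TOL = 1e-5  # threshold quoted in the description above


def keep_metals(data, tol=BAND_GAP_TOL):
    """Return only the rows whose band gap is at or below tol (illustrative)."""
    return data[data["band_gap"] <= tol].reset_index(drop=True)


# Placeholder data; the material IDs are made up for the example.
df = pd.DataFrame({
    "material_id": ["mp-a", "mp-b", "mp-c"],
    "band_gap": [0.0, 1.2, 0.000005],
})
metals = keep_metals(df)
```

Using a small tolerance instead of testing for exactly zero avoids dropping entries whose band gap is stored as a tiny floating-point value.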
- element_extract.data_combine(data1, data2)
Combines two pandas DataFrames into a single DataFrame.
Parameters:
- data1 : pandas DataFrame
The first DataFrame to be combined.
- data2 : pandas DataFrame
The second DataFrame to be combined.
Returns:
- data : pandas DataFrame
Combined DataFrame containing data from both data1 and data2.
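One plausible way to combine two result sets is a row-wise concatenation. The sketch below assumes that behaviour; `combine` is a hypothetical name, and whether the module also removes duplicate rows is not stated in the description.

```python
import pandas as pd


def combine(data1, data2):
    """Stack two DataFrames row-wise and renumber the index (illustrative)."""
    return pd.concat([data1, data2], ignore_index=True)


# Placeholder data; the material IDs are made up for the example.
a = pd.DataFrame({"material_id": ["mp-a"], "band_gap": [0.0]})
b = pd.DataFrame({"material_id": ["mp-b", "mp-c"], "band_gap": [1.2, 0.0]})
combined = combine(a, b)
```

If the two inputs can overlap, a call such as `combined.drop_duplicates(subset="material_id")` could follow the concatenation.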
- element_extract.remove(data, element_list)
Removes compounds containing specified elements from the DataFrame.
Parameters:
- data : pandas DataFrame
The DataFrame containing compounds to be filtered.
- element_list : str, optional
Name of the file listing the elements to exclude. Default is ‘remove.list’. The file should contain element symbols separated by commas. For example, to remove oxygen and nitrogen, write ‘O,N’ in ‘remove.list’.
Returns:
- data : pandas DataFrame
Processed DataFrame with compounds containing the specified elements removed.
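The comma-separated format and the element-based removal can be illustrated as follows. This is a sketch under assumptions: the helper names are hypothetical, the formula column is assumed to be called `formula_pretty`, and the file contents are passed in as a string rather than read from ‘remove.list’.

```python
import re

import pandas as pd


def read_elements(text):
    """Parse a comma-separated element list such as 'O,N'."""
    return {tok.strip() for tok in text.split(",") if tok.strip()}


def elements_in(formula):
    """Element symbols in a formula: a capital letter plus an optional lowercase one."""
    return set(re.findall(r"[A-Z][a-z]?", formula))


def drop_compounds(data, excluded, column="formula_pretty"):
    """Drop rows whose formula shares any element with excluded (illustrative)."""
    keep = data[column].apply(lambda f: elements_in(f).isdisjoint(excluded))
    return data[keep].reset_index(drop=True)


df = pd.DataFrame({"formula_pretty": ["MgB2", "MgO", "BN", "NbB2"]})
cleaned = drop_compounds(df, read_elements("O, N\n"))
```

Matching whole element symbols rather than substrings keeps NbB2 in the result even though ‘N’ is excluded.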
- element_extract.data_one_element_compound(elm, ntype, exclude_el, properties)
Extracts information for compounds containing only one element.
Parameters:
- elm : str
Element to search for in compounds. For example, ‘B’ for boron.
- ntype : int or tuple
Number of unique elements. Can be a single integer or a tuple (e.g., (1, 2) for 2 different types).
- exclude_el : list
List of elements to exclude. For example, [‘O’, ‘N’].
- properties : list
List of properties to extract.
Returns:
- data : pandas DataFrame
DataFrame containing information for compounds with only one element.
- element_extract.data_two_element_compound(el1, el2, ntype, exclude_el, properties)
Extracts information for compounds containing two elements.
Parameters:
- el1 : str
First element to search for in compounds (e.g., ‘B’ for boron).
- el2 : str
Second element to search for in compounds (e.g., ‘C’ for carbon).
- ntype : int or tuple
Number of unique elements in compounds. Can be a single integer or a tuple (e.g., (1, 2) for 2 different types).
- exclude_el : list
List of elements to exclude. For example, [‘O’, ‘N’].
- properties : list
List of properties to extract.
Returns:
- data : pandas DataFrame
DataFrame containing information for compounds with two elements.
- element_extract.create_input()
Reads the ‘download.csv’ file inside the ‘download’ folder and creates the ‘input.in’ and ‘mpid-list.in’ files for further downloading and calculations.
- element_extract.extract(ntype, properties, elm, exclude_el, nelm=1, metal=False, neg_fe=False, thermo_stable=False, ordering='NM', nsites=10, spacegroup=None, out='download/download.csv')
Extracts the data, applies filters, and writes the ‘download.csv’ file inside the ‘download’ folder.
Parameters:
- ntype : tuple
Number of unique elements in the compound. For example, (1, 3) for compounds with 3 unique elements.
- properties : list
List of properties to extract.
- elm : list
List of elements used in the search.
- exclude_el : list
List of elements to exclude.
- nelm : int, optional
Length of the list elm. Default is 1.
- metal : bool, optional
True to download only zero-band-gap compounds. Default is False.
- neg_fe : bool, optional
True to download only compounds with negative formation energy. Default is False.
- thermo_stable : bool, optional
True to download only compounds at the convex hull. Default is False.
- ordering : str, optional
Magnetic ordering of the compound. Default is ‘NM’.
- nsites : int, optional
Maximum number of sites in the compound. Default is 10.
- spacegroup : int or str, optional
Spacegroup number or name. Default is None.
- out : str, optional
Output file to write. Default is ‘download/download.csv’.
Returns:
- data : pandas DataFrame
Extracted data after applying filters.
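The way the boolean flags above could map onto successive filters is sketched below. It is an illustration only: `apply_filters` is a hypothetical helper, the `energy_above_hull` column name and both tolerances are assumptions, and the real function also queries the database and writes the CSV file.

```python
import pandas as pd


def apply_filters(data, metal=False, neg_fe=False, thermo_stable=False,
                  band_gap_tol=1e-5, hull_tol=0.0):
    """Apply the optional filters in sequence (illustrative; tolerances assumed)."""
    if metal:
        data = data[data["band_gap"] <= band_gap_tol]
    if neg_fe:
        data = data[data["formation_energy_per_atom"] < 0]
    if thermo_stable:
        data = data[data["energy_above_hull"] <= hull_tol]
    return data.reset_index(drop=True)


# Placeholder data; IDs and values are made up for the example.
df = pd.DataFrame({
    "material_id": ["mp-a", "mp-b", "mp-c", "mp-d"],
    "band_gap": [0.0, 0.0, 1.5, 0.0],
    "formation_energy_per_atom": [-0.3, 0.1, -0.8, -0.2],
    "energy_above_hull": [0.0, 0.05, 0.0, 0.12],
})
filtered = apply_filters(df, metal=True, neg_fe=True, thermo_stable=True)
```

With every flag left at its default of False, the data passes through unchanged, which matches the defaults listed above.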
- element_extract.download_by_entry(entries, must_include, size_constraint=20, ntype_constraint=5, FE=False, thermo_stable=True, metal=False, magnetic=False, spacegroup=None, properties=None)
Extracts data and creates input files using the mp_api.client.MPRester.get_entries_in_chemsys function of the Materials Project API package (pip install mp_api). This mode is enabled by setting ‘mode’: ‘chemsys’ in the ‘download.py’ file.
Parameters:
- entries : list
List of elements defining the search: the elements themselves and the compounds formed from combinations of them.
- must_include : list
Elements that must be included in the compounds.
- size_constraint : int, optional
Size of the compounds (total number of ions). Upper bound not included. Default is 20.
- ntype_constraint : int, optional
Number of different types of ions. Upper bound not included. Default is 5.
- FE : bool, optional
True to keep only compounds with negative formation energy. Default is False.
- metal : bool, optional
True to keep only metallic compounds. Default is False.
- magnetic : bool, optional
True to keep only compounds with a non-zero magnetic moment. Default is False.
- spacegroup : int or str, optional
Spacegroup number or name. Default is None.
- properties : list, optional
List of properties to extract.
Returns:
None
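The exclusive upper bounds on size_constraint and ntype_constraint can be made concrete with a small check. This is a sketch, not the module's code: `within_constraints` is a hypothetical helper, and representing a compound as a dictionary of ion counts is an assumption made for the example.

```python
def within_constraints(composition, size_constraint=20, ntype_constraint=5):
    """Check a compound against both limits.

    composition maps an element symbol to its number of ions in the cell.
    """
    n_ions = sum(composition.values())
    n_types = len(composition)
    # Both upper bounds are exclusive, as the parameter descriptions state.
    return n_ions < size_constraint and n_types < ntype_constraint
```

With the defaults, a cell of exactly 20 ions or a compound with exactly 5 element types is rejected.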