refactor(data preprocess): remove the cut off options from info.json #200
…and collect the values from input.json
Previously, ASE data was converted to a text file and then loaded by _TrajData. The function is now refactored so that text and ASE data are treated equally. It now works as a class function that initializes the _TrajData class.
…om_model_options .
…g96. For powerlaw and varTang96, rs is not exactly the hard cutoff, so when extracting r_max for the data we have to use rs + 5 * w; for the other methods, just use rs.
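The cutoff rule described in this commit message can be sketched as follows. This is only an illustration of the stated rule: the function name `effective_r_max` and the set `SOFT_CUTOFF_METHODS` are hypothetical and are not taken from the repository.

```python
# Hypothetical helper illustrating the rule from the commit message;
# the names below are assumptions, not the project's actual API.
SOFT_CUTOFF_METHODS = {"powerlaw", "varTang96"}


def effective_r_max(method: str, rs: float, w: float = 0.0) -> float:
    """Return the cutoff radius to use when building the data.

    For powerlaw and varTang96, rs is not a hard cutoff, so the
    radius is extended by 5 * w. For other methods, rs is used as is.
    """
    if method in SOFT_CUTOFF_METHODS:
        return rs + 5 * w
    return rs
```

With this sketch, a powerlaw setup with rs = 5.0 and w = 0.2 would use a data cutoff of about 6.0, while any other method would use 5.0.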
…s instance and add from_model class function. Note: compared to the previous build_dataset, this one is more flexible. Previously, build_dataset was a function. Now a class DataBuilder is defined with a redefined __call__ method, and build_dataset is an instance of DataBuilder. This allows build_dataset.from_model() to build a dataset from a model, while the previous usage, build_dataset(...), remains available.
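A minimal sketch of the callable-instance pattern described in this commit message is shown below. Only the names `DataBuilder`, `build_dataset`, `__call__`, and `from_model` come from the message; the arguments, the returned dictionary, and the assumption that the model exposes an `r_max` attribute are placeholders for illustration.

```python
from typing import Any, Dict


class DataBuilder:
    """Callable builder: an instance can be used like the old function."""

    def __call__(self, root: str, r_max: float, **kwargs: Any) -> Dict[str, Any]:
        # Placeholder for the real dataset construction (assumption):
        # here the options are simply collected into a dict.
        return {"root": root, "r_max": r_max, **kwargs}

    def from_model(self, model: Any, root: str, **kwargs: Any) -> Dict[str, Any]:
        # Assumption: the model carries the cutoff it was built with.
        r_max = getattr(model, "r_max")
        return self(root=root, r_max=r_max, **kwargs)


# The module-level name stays the same, so old call sites keep working.
build_dataset = DataBuilder()
```

Both `build_dataset(root="data", r_max=5.0)` and `build_dataset.from_model(model, root="data")` go through the same `__call__` path, which is what keeps the old calling style available.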
@@ -500,7 +500,7 @@ def from_points(
 def from_ase(
 cls,
 atoms,
-r_max,
+r_max: Union[float, int, dict],
 er_max: Optional[float] = None,
Shouldn't er_max and oer_max be updated here in the same way?
# same cell size, then copy it to all frames.
cell = np.expand_dims(cell, axis=0)
data["cell"] = np.broadcast_to(cell, (info["nframes"], 3, 3))
elif cell.shape[0] == info["nframes"] * 3:
Is nframes still kept in info now?
pos = np.loadtxt(os.path.join(root, "positions.dat"))
if len(pos.shape) == 1:
pos = pos.reshape(1,3)
natoms = info["natoms"]
OK, it looks like the logic for nframes and natoms is unchanged.
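Removing nframes and natoms would require changing the storage format of the text data, for example storing each frame as a single line. Otherwise this information cannot be extracted from the data, so I cannot remove them.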
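The point about the text format can be illustrated with a small sketch, assuming positions are stored as a flat table with one atom per row. The helper name `split_frames` is hypothetical; only `nframes` and `natoms` come from the discussion.

```python
import numpy as np


def split_frames(flat_pos: np.ndarray, nframes: int, natoms: int) -> np.ndarray:
    """Reshape a flat (nframes * natoms, 3) table into (nframes, natoms, 3)."""
    # A single-atom, single-frame file loads as shape (3,); make it 2D.
    flat_pos = np.atleast_2d(flat_pos)
    if flat_pos.shape != (nframes * natoms, 3):
        raise ValueError(
            f"expected {nframes * natoms} rows of 3 columns, got {flat_pos.shape}"
        )
    return flat_pos.reshape(nframes, natoms, 3)


# Six rows could be 2 frames of 3 atoms or 3 frames of 2 atoms:
# the file alone cannot tell, so the metadata is needed.
flat = np.arange(18, dtype=float).reshape(6, 3)
two_frames = split_frames(flat, nframes=2, natoms=3)
three_frames = split_frames(flat, nframes=3, natoms=2)
```

Both calls succeed on the same array, which is why the frame and atom counts have to be supplied separately unless the storage format changes.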
@@ -170,7 +170,7 @@ def __init__(self, model:torch.nn.Module, results_path: str=None, use_gui: bool=
 self.results_path = results_path
 self.use_gui = use_gui

-def get_bands(self, data: Union[AtomicData, ase.Atoms, str], kpath_kwargs: dict, AtomicData_options: dict={}):
+def get_bands(self, data: Union[AtomicData, ase.Atoms, str], kpath_kwargs: dict, pbc:Union[bool,list]=None, AtomicData_options:dict=None):
Why is pbc added separately here? Also, AtomicData_options was replaced by info in the data part earlier, so why is it kept here?
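pbc is added separately because this information cannot always be extracted from the given structure file, so it must be possible to specify it externally. AtomicData_options is supported here for compatibility with older checkpoints. It does not need to be provided going forward, but for some old checkpoints it must be supplied, otherwise the checkpoint cannot be used.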
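It was dropped from the data part because future training tasks will no longer need it. For newly trained models, it can also be omitted when computing band structures in post-processing. The parameter is now optional.
All of this is a setting we have to keep for the sake of software compatibility.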
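The optional-argument handling discussed in this thread could look roughly like the sketch below. The helper `resolve_band_inputs` and its `structure_pbc` argument are hypothetical; only the parameter names `pbc` and `AtomicData_options` come from the diff. The precedence shown (an explicit `pbc` overrides the value from the structure) is an assumption.

```python
from typing import List, Optional, Tuple, Union


def resolve_band_inputs(
    structure_pbc: Optional[List[bool]],
    pbc: Optional[Union[bool, List[bool]]] = None,
    AtomicData_options: Optional[dict] = None,
) -> Tuple[List[bool], dict]:
    """Pick the pbc and options to use, falling back when they are omitted."""
    # Assumption: an explicitly passed pbc wins over the structure's value.
    if pbc is None:
        pbc = structure_pbc
    if pbc is None:
        raise ValueError("pbc is not available from the structure; pass it explicitly")
    if isinstance(pbc, bool):
        pbc = [pbc] * 3
    # None as the default avoids sharing one mutable dict across calls,
    # which a default of {} would do.
    options = dict(AtomicData_options) if AtomicData_options is not None else {}
    return list(pbc), options
```

New models would call this without `AtomicData_options`, while an old checkpoint could still pass the dictionary it needs.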
refactor(data preprocess): remove the cutoff options from info.json and collect the values from input.json. When running a model, there is no need to supply the AtomicData options.
Fix: #155