Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

creating function main #4

Draft
wants to merge 18 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions diffpy/snmf/io.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,14 +44,16 @@ def initialize_variables(data_input, component_amount, data_type, sparsity=1, sm

component_matrix_guess = np.random.rand(signal_length, component_amount)
weight_matrix_guess = np.random.rand(component_amount, moment_amount)
stretching_matrix_guess = np.ones(component_amount, moment_amount) + np.random.randn(component_amount,
stretching_matrix_guess = np.ones((component_amount, moment_amount)) + np.random.randn(component_amount,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should probably not be in io.py. I would suggest to make functions somewhere in subroutines that initialize specific arrays then here write something like:

stretching_matrix_guess = initialize_arrays(number_of_components, number_of_moments)

or sthg like that.

In the function in subroutines.py, put into docstring what the arrays are and how the decision were made on how to do it. Currently the code is a bit too hard to read as it is....I think that by component_amount , based on earlier conversations, you mean number_of_components. I have no idea what moment_amount means, nor why you initialize it with ones and then add some kind of random component, so a docstring could be veyr helpful here (but in subroutines, not in io.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will create that function on a new branch. I will also rename the variables to make it more clear.

moment_amount) * 1e-3

diagonals = [np.ones(moment_amount - 2), -2 * np.ones(moment_amount - 2), np.ones(moment_amount - 2)]
smoothness_term = .25 * scipy.sparse.diags(diagonals, [0, 1, 2], shape=(moment_amount - 2, moment_amount))

hessian_helper_matrix = scipy.sparse.block_diag([smoothness_term.T @ smoothness_term] * component_amount)
sequence = np.arange(moment_amount * component_amount).reshape(component_amount, moment_amount).T.flatten()

hessian_helper_matrix = hessian_helper_matrix.tocsr()
hessian_helper_matrix = hessian_helper_matrix[sequence, :][:, sequence]

return {
Expand Down Expand Up @@ -102,7 +104,7 @@ def load_input_signals(file_path=None):
for item in directory_path.iterdir():
if item.is_file():
data = loadData(item.resolve())
if current_grid and current_grid != data[:, 0]:
if len(current_grid) != 0 and (current_grid != data[:, 0]).any():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a comment line would help here with the intent. I.e., why do we ignore and not handle if the data are on a different grid. If I remember correctly t here was a physics reason (mathematically it should be straightforward). Just recording for future people.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a comment line to explain this

print(f"{item.name} was ignored as it is not on a compatible grid.")
continue
else:
Expand Down
23 changes: 22 additions & 1 deletion diffpy/snmf/stretchednmfapp.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,26 @@
import numpy as np

from diffpy.snmf.io import load_input_signals, initialize_variables


def main():
print("Hello World!")
directory_path = input("Specify Path (Optional. Press enter to skip):")
aajayi-21 marked this conversation as resolved.
Show resolved Hide resolved
if not directory_path:
directory_path = None

data_type = input("Specify the data type ('xrd' or 'pdf'): ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can often be obtained from file header.

if data_type != 'xrd' and data_type != 'pdf':
raise ValueError("The data type must be 'xrd' or 'pdf'")

component_amount = input("\nEnter the amount of components to obtain:")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably be "number" of components. "amount" sounds like a quantity of a continuous quantity, like amount of sand or amount of flour.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change the naming.

try:
component_amount = int(component_amount)
except TypeError:
raise TypeError("Please enter an integer greater than 0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put this in the input instructions rather than have the poor user have to enter quantities and have the program fail to find out they were supposed to put in an integer...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will do this


grid, data_input = load_input_signals(directory_path)
variables = initialize_variables(data_input, component_amount, data_type)
lifted_data = data_input - np.ndarray.min(data_input[:])


if __name__ == "__main__":
Expand Down