-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
creating function main #4
base: main
Are you sure you want to change the base?
Changes from 6 commits
36fce4d
2ffed0c
42dac41
dfa4910
b982382
6d8be90
6fb2879
07cd3ec
1be46c2
32bf269
e527a19
865251e
8f6b94e
a174475
8b26f4f
a024bb5
1331805
7d4c97f
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,14 +44,16 @@ def initialize_variables(data_input, component_amount, data_type, sparsity=1, sm | |
|
||
component_matrix_guess = np.random.rand(signal_length, component_amount) | ||
weight_matrix_guess = np.random.rand(component_amount, moment_amount) | ||
stretching_matrix_guess = np.ones(component_amount, moment_amount) + np.random.randn(component_amount, | ||
stretching_matrix_guess = np.ones((component_amount, moment_amount)) + np.random.randn(component_amount, | ||
moment_amount) * 1e-3 | ||
|
||
diagonals = [np.ones(moment_amount - 2), -2 * np.ones(moment_amount - 2), np.ones(moment_amount - 2)] | ||
smoothness_term = .25 * scipy.sparse.diags(diagonals, [0, 1, 2], shape=(moment_amount - 2, moment_amount)) | ||
|
||
hessian_helper_matrix = scipy.sparse.block_diag([smoothness_term.T @ smoothness_term] * component_amount) | ||
sequence = np.arange(moment_amount * component_amount).reshape(component_amount, moment_amount).T.flatten() | ||
|
||
hessian_helper_matrix = hessian_helper_matrix.tocsr() | ||
hessian_helper_matrix = hessian_helper_matrix[sequence, :][:, sequence] | ||
|
||
return { | ||
|
@@ -102,7 +104,7 @@ def load_input_signals(file_path=None): | |
for item in directory_path.iterdir(): | ||
if item.is_file(): | ||
data = loadData(item.resolve()) | ||
if current_grid and current_grid != data[:, 0]: | ||
if len(current_grid) != 0 and (current_grid != data[:, 0]).any(): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think a comment line would help here with the intent. I.e., why do we ignore and not handle if the data are on a different grid. If I remember correctly t here was a physics reason (mathematically it should be straightforward). Just recording for future people. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will add a comment line to explain this |
||
print(f"{item.name} was ignored as it is not on a compatible grid.") | ||
continue | ||
else: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
import numpy as np | ||
import argparse | ||
|
||
from diffpy.snmf.io import load_input_signals, initialize_variables | ||
|
||
|
||
def create_parser(): | ||
parser = argparse.ArgumentParser( | ||
prog="stretched_nmf", | ||
description="Stretched Nonnegative Matrix Factorization" | ||
) | ||
parser.add_argument('-v', '--version', action='version', help='Print the software version number') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. put this at the bottom, it will be rarely used. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will do that. |
||
parser.add_argument('-d', '--directory', type=str, | ||
help="Directory containing experimental data. Ensure it is in quotations or apostrophes.") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why does it have to be in quotes? Call it maybe I think it will end up being a good idea to add an argument There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. by default, the default value of an optional argument is None (so it would be redundant to explicitly set that). The load_input_signal function handles the case where it is None as you described. When I was using it, the cli wouldn't work when I specified a directory and it wasn't in quotes. I can see if I can fix this in the code. I will rename the variables and add an argument as you described. Should I make a new function in io.py to handle dumping the results in a new directory? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok, sounds good. The most important thing is that the users know what is happening. They don't need to know "the default value is None and this is handled in the code" they need to know the arg is optional and the default behavior is that it uses the current working directory. It is always a good idea to make your intentions clearer for other devs who come later, so I could definitely argue in favor of explicitly defining default to be None. It just reduces cognitive overload on the person coming later....they don't have to scratch their head and think, "what is going on?" and having to remember what the default behavior of argparse is.... |
||
|
||
parser.add_argument('component_number', type=int, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suggest maybe There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok. I will rename it here and in the rest of the code. |
||
help="The number of component signals to obtain from experimental " | ||
"data. Must be an integer greater than 0.") | ||
parser.add_argument('data_type', type=str, choices=['xrd', 'pdf'], help="The type of the experimental data.") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in the cli, always use a For this it is often present in the file header, so maybe better to make this optional, but raise an exception and ask the user to rerun specifying it if you can't find it in the file header. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand. I will make it an optional argument and rename it as you described. |
||
args = parser.parse_args() | ||
return args | ||
|
||
|
||
def main(): | ||
args = create_parser() | ||
|
||
grid, data_input = load_input_signals(args.directory) | ||
variables = initialize_variables(data_input, args.component_number, args.data_type) | ||
lifted_data = data_input - np.ndarray.min(data_input[:]) | ||
return lifted_data | ||
|
||
|
||
if __name__ == "__main__": | ||
main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should probably not be in io.py. I would suggest to make functions somewhere in subroutines that initialize specific arrays then here write something like:
or sthg like that.
In the function in subroutines.py, put into docstring what the arrays are and how the decision were made on how to do it. Currently the code is a bit too hard to read as it is....I think that by
component_amount
, based on earlier conversations, you meannumber_of_components
. I have no idea whatmoment_amount
means, nor why you initialize it with ones and then add some kind of random component, so a docstring could be veyr helpful here (but in subroutines, not in io.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will create that function on a new branch. I will also rename the variables to make it more clear.