-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#3] 입력 데이터를 처리하기 위한 namedtuple 구현 #6
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[공통 코멘트]
- 파라미터에 대한 데이터형 힌트 추가 부탁드립니다
- 모듈, 클래스, 함수에 대한 DocString 추가 부탁드립니다.
CATS/inputs.py
Outdated
group_name=DEFAULT_GROUP_NAME): | ||
if embedding_name is None: | ||
embedding_name = name | ||
elif embedding_name == 'auto': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
embedding_name == 'auto'
가 의도하신 부분이 맞으실까요? 해당 조건문 하위에서 동작하는 로직이 임베딩차원과 관련되어 보여서
elif embedding_dim == 'auto':
가 아닌가 해서요!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
넵 오타네요..ㅠ 수정하겠습니다.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
오타 수정완료하였습니다. 212ec0d
CATS/inputs.py
Outdated
'group_name'])): | ||
__slots__ = () | ||
|
||
def __new__(cls, name, vocabulary_size, embedding_dim=4, use_hash=False, dtype="int32", embedding_name=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
함수의 param에 데이터형 힌트 추가 해주세요! 해당 함수(__new__
)뿐만 아니라 모든 코드에 적용 부탁드립니다.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
넵 모든코드에 적용하도록 하겠습니다. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SparseFeat 클래스 내 new() 함수 param에 데이터형 힌트 추가 완료하였습니다. b08ff67
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VarLenSparseFeat 클래스 내 new()함수 param에 데이터형 힌트 추가 완료하였습니다. cb328a4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DenseFeat 클래스 내 new()함수 param에 데이터형 힌트 추가 완료하였습니다. b77a4f4
class SparseFeat(namedtuple('SparseFeat', | ||
['name', 'vocabulary_size', 'embedding_dim', 'use_hash', 'dtype', 'embedding_name', | ||
'group_name'])): | ||
__slots__ = () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__slots__
의 용처와 사용법이 어떻게 되는지 설명해 주실 수 있으실까요?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
기존 클래스는 instance를 생성하면 dictionary에 키와 값을 저장합니다. dictionary를 사용했을 때, 장점은 멤버변수를 동적으로 추가하기 쉽다는 것입니다. 하지만 단점은 멤버변수가 동적이기 때문에 잘못된 멤버변수를 사용하여 오류가 날 수 있고 객체 마다 별도의 메모리를 관리해야하기 때문에 메모리 오버헤드가 큽니다.
__slots__는 __dict__와 다르게 static한 멤버변수를 가지기 때문에 고정된 메모리 공간을 사용하여 효율적인 메모리 관리가 가능합니다. 그렇기 때문에 객체 내 저장될 데이터의 수가 많다면 __slots__를 사용하여 객체를 관리하는 것이 효율적입니다.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
�그럼 nametuple 역시 불변 객체를 만드는데요. 명시적으로 __slots__()
을 적어주신 추가적인 이유가 있으신지 궁금합니다!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__slots__를 ()로 선언하게되면 동적으로 속성값을 변경할 수 없게되며 메모리 효율이 증가합니다. 그렇기 때문에 현재 객체내 멤버변수값을 동적으로 바꿀 필요가 없는경우 메모리 효율을 위해 slots=()를 작성하였습니다.
CATS/inputs.py
Outdated
['sparsefeat', 'maxlen', 'combiner', 'length_name'])): | ||
__slots__ = () | ||
|
||
def __new__(cls, sparsefeat, maxlen, combiner="mean", length_name=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
combiner
종류를 str
형으로 되어 있어서, 사용자가 combiner
로 사용 가능한 값이 어떤 것인지 인지하기 쉽지 않습니다. Enum
또는 Literal
로 입력 가능한 값을 기정의하는 방식으로 처리하면 좋을 것 같습니다. (참고)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
넵 수정하겠습니다.
CATS/inputs.py
Outdated
def __new__(cls, name, dimension=1, dtype="float32"): | ||
return super(DenseFeat, cls).__new__(cls, name, dimension, dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dimension
값에 대한 유효성 검증이 추가되면 좋겠습니다.
def __new__(cls, name, dimension=1, dtype="float32"): | |
return super(DenseFeat, cls).__new__(cls, name, dimension, dtype) | |
def __new__(cls, name, dimension=1, dtype="float32"): | |
if dimension < 0 and not isinstance(dimension, int): | |
raise ValueError(...) | |
return super(DenseFeat, cls).__new__(cls, name, dimension, dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
넵 수정하도록 하겠습니다.😄
코멘트 달아주신 내용 수정하여 commit 하였습니다. 검토 부탁드립니다.🙏🏻 |
@22ema 몇가지 안내드릴 사항이 있어서 내일 멘토링 시간에 구두로 얘기드리도록 하겠습니다!
|
넵! |
DEFAULT_GROUP_NAME = "default_group" | ||
|
||
|
||
class SparseFeat(namedtuple('SparseFeat', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
가이드를 드리기위한 코멘트 (임시)
This reverts commit 02b7879.
What is this PR 🔍
Changes 🔑
To Reviewer 🙇♂️