Posts in the 'All categories' category

Channel-wise dropout: code implementation

CustomDropout is the standard dropout; its learning curve comes out identical to nn.Dropout. ChannelDropout is the modified version. As for performance... with a simple CNN neither one does well, hmm....

class CustomDropout(nn.Module):
    def __init__(self, p: float = 0.5):
        super(CustomDropout, self).__init__()
        self.p = p

    def forward(self, x):
        if self.training:
            # keep each element with probability 1 - p, then rescale (inverted dropout)
            mask = (torch.rand_like(x) > self.p).float()
            return x * mask / (1 - self.p)
        ..
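The excerpt is cut off before ChannelDropout appears, so the following is only a framework-free sketch of how the two masking schemes differ. It assumes ChannelDropout makes one keep/drop decision per channel; the post's actual implementation is not shown. The function names `elementwise_dropout` and `channel_dropout` are hypothetical, and nested lists stand in for a (channels, height, width) tensor.

```python
import random


def elementwise_dropout(x, p, rng):
    """Inverted dropout: each value is zeroed independently with probability p."""
    scale = 1.0 / (1.0 - p)
    return [
        [[v * scale if rng.random() > p else 0.0 for v in row] for row in channel]
        for channel in x
    ]


def channel_dropout(x, p, rng):
    """Assumed channel-wise variant: one draw per channel zeroes the whole feature map."""
    scale = 1.0 / (1.0 - p)
    out = []
    for channel in x:
        keep = rng.random() > p  # single decision shared by every pixel in the channel
        out.append([[v * scale if keep else 0.0 for v in row] for row in channel])
    return out


rng = random.Random(0)
x = [[[1.0] * 4 for _ in range(4)] for _ in range(8)]  # 8 channels of 4x4 ones
dropped = channel_dropout(x, p=0.5, rng=rng)
mixed = elementwise_dropout(x, p=0.5, rng=rng)
```

In `dropped`, every channel is either all zeros or all rescaled values, while `mixed` zeroes individual positions. PyTorch's `nn.Dropout2d` also zeroes entire channels, so it may be a useful baseline to compare against.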

Code 2024.12.29

Pre-trained Resnet with Flowers102(Transfer Learning)

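Getting started
1. Same basic model structure and concepts as the previous post
2. Applied it to Flowers102 and ran it
3. Number of samples: train 1,020, test 6,149 (??)
 * Counting per label shows exactly 10 images for each of the 102 classes. Is it meant for few-shot learning?
 ChatGPT's answer (conclusion): Flowers102 has more test data than train data because
 1. the dataset was designed mainly for performance evaluation, and
 2. it tests whether a model can learn and generalize well from little training data.
4. After tuning, it does run, and it produces numbers! Since the train set is so small, increasing the number of epochs should be fine.
Base model (base), epoch 30
1. train 91.8%, tes..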

Code 2024.12.29

Pre-trained Resnet with CIFAR-10(Transfer Learning) Part 2

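Code (GitHub): https://github.com/LGMpr/Pre-trained-Resnet-with-CIFAR-10-Transfer-Learning-/blob/main/aa
Base (epoch 10)
1. train 85.9%, test 82.7% ** the epoch-100 run reaches test 85.3%
Learning rate scheduler (CosineAnnealingLR) (New mod1)
1. train 84.1%, test 80.4%
2. Interpretation: the scheduler step was placed inside the batch loop, so the learning rate probably decays too fast. Try stepping once per epoch instead. Also, given what the scheduler is designed for, a meaningful effect is unlikely at around 10 epochs.
3. Stepping once per epoch (mod1.1): train 85.9%, test 82.5%
4. ..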
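The suspicion that stepping the scheduler inside the batch loop decays the learning rate too quickly can be checked with the closed-form cosine annealing formula. This is a rough sketch: `BASE_LR` and `T_MAX` are assumed values, since the excerpt does not state the settings used in the run.

```python
import math


def cosine_annealing_lr(step, t_max, base_lr, eta_min=0.0):
    """Closed-form cosine annealing: learning rate after `step` scheduler steps."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * step / t_max))


BASE_LR = 1e-3  # assumed initial learning rate
T_MAX = 10      # assumed: T_max set to the number of epochs

# Stepping once per epoch: after the first epoch the scheduler has advanced 1 step.
lr_per_epoch = cosine_annealing_lr(1, T_MAX, BASE_LR)

# Stepping once per batch with the same T_MAX: the learning rate already
# reaches eta_min after only T_MAX batches, early in the first epoch.
lr_after_t_max_batches = cosine_annealing_lr(T_MAX, T_MAX, BASE_LR)

print(f"per-epoch stepping, after epoch 1: {lr_per_epoch:.6f}")
print(f"per-batch stepping, after {T_MAX} batches: {lr_after_t_max_batches:.6f}")
```

If the scheduler is meant to be stepped per batch, one common choice is to set `T_max` to the total number of batches (epochs times batches per epoch) so that the full cosine curve spans the whole run.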

Code 2024.12.28

Pre-trained Resnet with CIFAR-10(Transfer Learning)

Code (GitHub): https://github.com/LGMpr/Pre-trained-Resnet-with-CIFAR-10-Transfer-Learning-/blob/main/aa
Introduction
1. Model followed: https://velog.io/@pppanghyun/%EC%A..

Code 2024.12.28

pytorch datasets

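| Dataset | Description | Samples (train, test) | Classes | Image size | Notes |
|---|---|---|---|---|---|
| CIFAR-10 | Various | 60,000 (50k, 10k) | 10 | 32x32 | |
| CIFAR-100 | Various | 60,000 (50k, 10k) | 100 | 32x32 | |
| ImageNet | | 1,350,000 (1.2M, 100k) | 1000 | | 50k validation images |
| MNIST | Handwritten digits, 0~9 | 70,000 (60k, 10k) | 10 | 28x28 | Grayscale |
| Fashion-MNIST | Clothing items | 70,000 (60k, 10k) | 10 | 28x28 | Grayscale |
| Caltech101 | Various | 8,677 | 101 | | For image recognition |
| Omniglot | Handwriting from 20 people, 1,623 characters, 50 alphabets | | | | One(few)-shot learning |
| celeb | Face attributes | 200,000?? | | | |
| Flowers102 | UK flowers | 8,189 | 102 | Varies | Many similar classes |

Cifar10.data ## train data..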

Code 2024.12.10

Layer Normalization review

Problem
- How to reduce the training time: normalize (BN)
- BN's disadvantage: it depends on the mini-batch size, and it is not obvious how to apply it to RNNs
- RNNs require different statistics for different time-steps
Layer Normalization
- Definition: ????
- BN: normalizes the summed inputs to each hidden unit
- Transpose BN into LN by computing the normalization statistics within a layer on a single training case
- All the hidden units in a layer share the same normalization terms..
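The difference between the two sets of statistics can be made concrete with a small framework-free sketch. It omits the learned gain and bias, and the input values are made up for illustration; `normalize`, `layer_norm`, and `batch_norm` are hypothetical helper names.

```python
import math


def normalize(values, eps=1e-5):
    """Shift to zero mean and scale to unit variance (no learned gain/bias)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def layer_norm(batch):
    """Statistics per training case, shared by all hidden units in the layer."""
    return [normalize(row) for row in batch]


def batch_norm(batch):
    """Statistics per hidden unit, computed across the mini-batch."""
    columns = [normalize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


# Hypothetical summed inputs: 3 training cases x 4 hidden units.
a = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [-1.0, 0.0, 1.0, 2.0],
]
ln = layer_norm(a)
bn = batch_norm(a)
```

Because `layer_norm` only looks at one row at a time, its output for a training case does not change with the batch size, whereas `batch_norm` on a batch of one case collapses every unit to zero.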

Paper review 2024.12.04

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift review

Problem
Covariate shift
- The distribution of each layer's inputs changes during training
- The change in the distributions of layers' inputs presents a problem because the layers need to continuously adapt to the new distribution
- Saturating non-linearity and vanishing gradients are usually addressed by ReLU, careful initialization, and small learning rates. But BN is more stabl..

Paper review 2024.12.04

MNIST odd/even classification: code implementation and analysis (pytorch)

1. Data
 - MNIST * from torchvision. 28 x 28 image
 - batch size : 64
 - shuffle : True
 - Normalize : (0.5, 0.5)
2. Base model

| | fc1 | fc2 | fc3 |
|---|---|---|---|
| layer | 784, 128 | 128, 64 | 64, 1 |
| activation | relu | relu | sigmoid |

 - lr : 0.001
 - loss : BCELoss
 - optimizer : Adam
 - epoch : 2 ## kept small to shorten compute time, since no GPU was used
3. Metrics
 - train loss/accuracy
 - test loss/accuracy * test accuracy matters most
4. Experiment results
5. Conclusion / takeaways
 A. CNN, ..
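As a rough sketch of the setup listed in the post, the snippet below converts digit labels to parity targets, counts the parameters of the 784-128-64-1 network, and evaluates the per-sample binary cross-entropy. It assumes odd digits are the positive class and that every layer has a bias; the excerpt states neither, and the helper names are hypothetical.

```python
import math

LAYERS = [(784, 128), (128, 64), (64, 1)]  # fc1, fc2, fc3 sizes listed in the post


def parity_label(digit):
    """Assumed encoding: 1.0 for odd digits, 0.0 for even digits."""
    return float(digit % 2)


def parameter_count(layers):
    """Weights plus one bias per output unit for each fully connected layer."""
    return sum(n_in * n_out + n_out for n_in, n_out in layers)


def binary_cross_entropy(prob, target):
    """Per-sample BCE for a sigmoid output `prob` strictly between 0 and 1."""
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


labels = [parity_label(d) for d in range(10)]
total_params = parameter_count(LAYERS)
loss_confident = binary_cross_entropy(0.9, parity_label(7))  # correct and confident
loss_wrong = binary_cross_entropy(0.1, parity_label(7))      # confident but wrong
```

Under these assumptions the network has 108,801 trainable parameters, almost all of them in fc1.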

Code 2024.12.04

Resnet(Deep Residual Learning for Image Recognition) review

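Main content
1. Basic concepts
 - (Near-)identity mapping. Preconditioning: the input and output barely change, so F(x) should be close to the zero matrix. Since W is initialized with zero mean, optimization is also much easier.
 - Bottleneck
2. Concepts / terms
 - residual : in regression analysis, the actual value minus the estimated value.
 - The leading results on the challenging ImageNet dataset all exploit “very deep” models with a depth of sixteen to thirty
 - COCO : a dataset name. Mainly used for object detection and image classification
 - saturate : the output of an activation function approaches its maximum or minimum value (the gradient approaches 0) * non-s..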
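The near-identity idea can be illustrated with a toy residual block in plain Python. This is a minimal sketch, not the paper's architecture: F(x) is assumed to be two small linear maps with a ReLU in between, and the ReLU that the paper applies after the addition is omitted. All names are hypothetical.

```python
def linear(weights, x):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, w1, w2):
    """y = F(x) + x, with F(x) = W2 * relu(W1 * x) as a toy residual function."""
    fx = linear(w2, relu(linear(w1, x)))
    return [f + v for f, v in zip(fx, x)]


x = [1.0, -2.0, 0.5]
zero = [[0.0] * 3 for _ in range(3)]
small = [[0.01 if i == j else 0.0 for j in range(3)] for i in range(3)]

y_zero = residual_block(x, zero, zero)     # F(x) = 0, so the block is exactly the identity
y_small = residual_block(x, small, small)  # small weights give a small perturbation of x
```

With weights near zero the block starts out close to the identity, so the layers only need to learn a small correction F(x) rather than the full mapping.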

Paper review 2024.12.01

[MML] 4. Matrix Decompositions % ★★★

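Matrix Decompositions
- Determinant : ad - bc for a 2×2 matrix with rows (a, b) and (c, d). Written det(A) or $\left| A\right|$. Defined only for square matrices. It tells whether an inverse exists (it characterizes the matrix)
- Geometrically, it is the same as computing a "volume"
- Sarrus' rule : formula for the determinant of a 3×3 matrix; it does not extend to larger matrices ## just the pattern found when multiplying out the determinant (moving along the diagonals)
- Laplace expansion : reduces the problem to computing determinants of (n−1)×(n−1) matrices (reduction) * for a 3×3 matrix, Sarrus' rule and Laplace expansion give the same determinant
- Trace * tr(A)..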
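The claim that Sarrus' rule and Laplace expansion agree can be checked numerically. The sketch below expands along the first row; the matrices `A` and `B` are made-up examples and the function names are hypothetical.

```python
def det_laplace(m):
    """Determinant by Laplace (cofactor) expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]  # drop row 0 and column j
        total += (-1) ** j * m[0][j] * det_laplace(minor)
    return total


def det_sarrus(m):
    """Sarrus' rule; valid only for 3x3 matrices."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * e * i + b * f * g + c * d * h - c * e * g - b * d * i - a * f * h


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
B = [[1, 2], [3, 4]]

print(det_laplace(A), det_sarrus(A))  # both routes give the same 3x3 determinant
print(det_laplace(B))                 # 2x2 case reduces to ad - bc
```

The recursive expansion is fine for small matrices but its cost grows factorially with n, so it is only an illustration of the definition.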

Uncategorized 2024.11.26

Machine learning, deep learning
