[Python] Week 12: Strings - Regular Expressions

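Day 1: Basic Concepts of Regular Expressions

Lecture:
- Definition and characteristics of regular expressions
  - What regular expressions are and what they are used for
  - Basic regular expression patterns
- Introduction to the re module
  - Python's re module
  - Main functions of the re module (search, match, findall, etc.)
Practice:
- A simple regular expression example using the re module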

import re

# Simple regular expression example
pattern = r"abc"
text = "abcdef"
match = re.search(pattern, text)

if match:
    print("Matched:", match.group())  # 'abc'
else:
    print("No match")
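To make the difference between the functions listed for Day 1 concrete, here is a minimal sketch that runs match, search, and findall against the same text. The sample text and variable names are illustrative choices, not taken from the lesson.

```python
import re

text = "abcdef abc"

# match() only succeeds when the pattern matches at the very start of the string
at_start = re.match(r"abc", text)
not_at_start = re.match(r"def", text)   # None: 'def' is not at position 0

# search() scans the whole string and returns the first match it finds
first_def = re.search(r"def", text)

# findall() returns every non-overlapping match as a list of strings
all_abc = re.findall(r"abc", text)

print(at_start.group())   # abc
print(not_at_start)       # None
print(first_def.span())   # (3, 6)
print(all_abc)            # ['abc', 'abc']
```

Because match and search return None when nothing matches, it is safest to check the result before calling group().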

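Day 2: Basic Regular Expression Patterns I

Lecture:
- Basic metacharacters
  - . : any single character (except a newline by default)
  - ^ : start of the string
  - $ : end of the string
  - * : zero or more repetitions
  - + : one or more repetitions
  - ? : zero or one occurrence
- Character classes
  - [] : defines a character class
  - [^] : negated character class
Practice:
- Examples using basic metacharacters and character classes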

import re

# Basic metacharacter example
pattern = r"b.n"
text = "bat bin boon"
matches = re.findall(pattern, text)
print(matches)  # ['bin'] ('bat' does not end in n; 'boon' has two characters between b and n)

# Character class example
pattern = r"[aeiou]"
text = "hello"
matches = re.findall(pattern, text)
print(matches)  # ['e', 'o']

# Negated character class example
pattern = r"[^aeiou]"
text = "hello"
matches = re.findall(pattern, text)
print(matches)  # ['h', 'l', 'l']
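The Day 2 list also covers the anchors ^ and $ and the quantifiers *, +, and ?, which the examples above do not exercise. The following is a small sketch of how they behave. The sample strings are made up for illustration.

```python
import re

# * allows zero or more 'b', + requires at least one, ? makes the 'u' optional
star = re.findall(r"ab*", "a ab abb")
plus = re.findall(r"ab+", "a ab abb")
optional = re.findall(r"colou?r", "color colour")

print(star)      # ['a', 'ab', 'abb']
print(plus)      # ['ab', 'abb']
print(optional)  # ['color', 'colour']

# ^ and $ anchor the pattern to the start and the end of the string
starts_with_hello = re.search(r"^Hello", "Hello world") is not None
ends_with_world = re.search(r"world$", "Hello world") is not None
starts_with_world = re.search(r"^world", "Hello world") is not None

print(starts_with_hello)  # True
print(ends_with_world)    # True
print(starts_with_world)  # False
```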

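Day 3: Basic Regular Expression Patterns II

Lecture:
- Repetition
  - {m} : exactly m repetitions
  - {m,n} : at least m and at most n repetitions
- Grouping and capturing
  - () : grouping
  - \number : backreference to a captured group
- OR operator
  - | : alternation (OR)
Practice:
- Examples using repetition, grouping, and the OR operator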

import re

# Repetition example
pattern = r"a{2,3}"
text = "aaa aaaaa aa aaaa"
matches = re.findall(pattern, text)
print(matches)  # ['aaa', 'aaa', 'aa', 'aa', 'aaa']

# Grouping and capture example
pattern = r"(abc)+"
text = "abcabcabc"
match = re.match(pattern, text)
if match:
    print("Matched group:", match.group(0))  # 'abcabcabc' (group(0) is the whole match)

# OR operator example
pattern = r"cat|dog"
text = "I have a cat and a dog."
matches = re.findall(pattern, text)
print(matches)  # ['cat', 'dog']
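The Day 3 list mentions {m} and \number backreferences, neither of which appears in the examples above. Here is a hedged sketch of both, together with reading individual captured groups. The sample strings and variable names are assumptions made for illustration.

```python
import re

# {m}: exactly four digits, bounded on both sides by word boundaries
years = re.findall(r"\b\d{4}\b", "2024 12 123456 1999")
print(years)  # ['2024', '1999']

# Capturing groups: group(1) and group(2) hold the parenthesized parts
date = re.match(r"(\d{4})-(\d{2})", "2024-06")
year, month = date.group(1), date.group(2)
print(year, month)  # 2024 06

# Backreference: \1 must repeat exactly what group 1 captured
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")
print(doubled)  # ['is', 'test']
```

Note that findall returns the captured group rather than the whole match when the pattern contains exactly one group, which is why the last result lists single words.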

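Day 4: Advanced Regular Expression Patterns

Lecture:
- Advanced metacharacters (special sequences)
  - \b : word boundary
  - \B : non-word boundary
  - \d : digit character
  - \D : non-digit character
  - \s : whitespace character
  - \S : non-whitespace character
  - \w : word character (letter, digit, or underscore)
  - \W : non-word character
Practice:
- Examples using the advanced metacharacters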

import re

# \b example (word boundary)
pattern = r"\bword\b"
text = "This is a word in a sentence."
matches = re.findall(pattern, text)
print(matches)  # ['word']

# \d example (digits)
pattern = r"\d+"
text = "There are 123 apples."
matches = re.findall(pattern, text)
print(matches)  # ['123']

# \s example (whitespace)
pattern = r"\s"
text = "Hello world"
matches = re.findall(pattern, text)
print(matches)  # [' ']

# \w example (word characters)
pattern = r"\w+"
text = "This is a test."
matches = re.findall(pattern, text)
print(matches)  # ['This', 'is', 'a', 'test']
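The uppercase sequences \B, \D, \S, and \W are the complements of their lowercase counterparts. A minimal sketch of each follows. The sample strings are illustrative assumptions.

```python
import re

# \B matches only where there is no word boundary,
# so 'cat' must sit inside a longer word to match
inner = re.search(r"\Bcat\B", "cat concatenate")
print(inner.start())  # 7 (the 'cat' inside 'concatenate', not the standalone word)

# \D+ collects runs of non-digit characters
non_digits = re.findall(r"\D+", "There are 123 apples.")
print(non_digits)  # ['There are ', ' apples.']

# \S+ collects runs of non-whitespace characters
non_space = re.findall(r"\S+", "Hello   world")
print(non_space)  # ['Hello', 'world']

# \W matches single non-word characters (here: spaces and the period)
non_word = re.findall(r"\W", "This is a test.")
print(non_word)  # [' ', ' ', ' ', '.']
```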

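Day 5: Main Regular Expression Functions

Lecture:
- Main functions of the re module
  - match(): matches the pattern at the start of the string
  - search(): finds the first match anywhere in the string
  - findall(): returns a list of all strings that match the pattern
  - finditer(): returns an iterator of match objects for all matches
  - sub(): replaces the substrings that match the pattern with another string
Practice:
- Examples using the main functions of the re module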

import re

# match() example
pattern = r"Hello"
text = "Hello, world!"
match = re.match(pattern, text)
if match:
    print("Matched:", match.group())  # 'Hello'

# search() example
pattern = r"world"
text = "Hello, world!"
match = re.search(pattern, text)
if match:
    print("Matched:", match.group())  # 'world'

# findall() example
pattern = r"\d+"
text = "123 abc 456 def"
matches = re.findall(pattern, text)
print(matches)  # ['123', '456']

# finditer() example
pattern = r"\d+"
text = "123 abc 456 def"
matches = re.finditer(pattern, text)
for match in matches:
    print("Match position:", match.start(), "-", match.end())  # 0 - 3, 8 - 11

# sub() example
pattern = r"apple"
text = "I like apples. Apples are good."
result = re.sub(pattern, "orange", text, flags=re.IGNORECASE)
print(result)  # 'I like oranges. oranges are good.'
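sub() and finditer() have a few options that the examples above do not show. As a sketch under stated assumptions (the helper name double and the sample strings are hypothetical), the replacement passed to sub() can be a function, the count argument limits the number of replacements, and finditer() exposes both the matched text and its position:

```python
import re

# sub() accepts a function as the replacement; it is called with each match object
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r"\d+", double, "3 apples and 10 pears")
print(doubled)  # 6 apples and 20 pears

# count limits how many replacements are made (here: only the first one)
first_only = re.sub(r"a", "A", "banana", count=1)
print(first_only)  # bAnana

# finditer() yields match objects, so the text and the start index are both available
found = [(m.group(), m.start()) for m in re.finditer(r"\d+", "123 abc 456 def")]
print(found)  # [('123', 0), ('456', 8)]
```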

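Day 6: Practical Regular Expression Examples

Lecture:
- Practical examples using regular expressions
  - Validating email addresses
  - Converting phone number formats
  - Extracting URLs from text
Practice:
- Writing practical examples with regular expressions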

import re

# Email address validation (a simplified pattern, not a full standards-compliant check)
pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
emails = ["test@example.com", "invalid-email@", "user@domain.co"]

for email in emails:
    if re.match(pattern, email):
        print(f"Valid email: {email}")
    else:
        print(f"Invalid email: {email}")

# Phone number format conversion
pattern = r"(\d{3})-(\d{3})-(\d{4})"
text = "My phone number is 123-456-7890."
new_text = re.sub(pattern, r"(\1) \2-\3", text)
print(new_text)  # 'My phone number is (123) 456-7890.'

# Extracting URLs from text
pattern = r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
text = "Visit https://www.example.com and http://www.test.com for more information."
urls = re.findall(pattern, text)
print(urls)  # ['https://www.example.com', 'http://www.test.com']
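One subtlety with validating via match() and ^...$ is that $ also matches just before a trailing newline, so a string ending in "\n" can slip through. The sketch below uses re.fullmatch as one possible alternative. The helper name is_valid_email and the constant EMAIL_PATTERN are hypothetical, and the pattern is the same simplified one used above.

```python
import re

# Same simplified email pattern as above, without the ^ and $ anchors
EMAIL_PATTERN = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"

def is_valid_email(candidate):
    """Return True only if the entire string matches the simplified pattern."""
    return re.fullmatch(EMAIL_PATTERN, candidate) is not None

# With ^...$ and match(), a trailing newline is still accepted
with_dollar = re.match(r"^" + EMAIL_PATTERN + r"$", "test@example.com\n") is not None
print(with_dollar)  # True

# fullmatch() requires the pattern to cover the whole string
print(is_valid_email("test@example.com\n"))  # False
print(is_valid_email("test@example.com"))    # True
print(is_valid_email("invalid-email@"))      # False
```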

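Day 7: Comprehensive Practice and Project

Lecture:
- Working through comprehensive regular expression exercises
  - A variety of regular expression problems
  - Q&A session
- Mini project
  - Choosing a topic and designing the program
  - Implementing and testing a program that uses regular expressions
Practice:
- Solving the comprehensive exercises
- Building and presenting the mini project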

import re

# Exercise 1: Write a function that finds all lowercase vowels in a string and returns them as a list
def find_vowels(text):
    pattern = r"[aeiou]"
    return re.findall(pattern, text)

print(find_vowels("Hello, World!"))  # ['e', 'o', 'o']

# Exercise 2: Write a function that extracts all words from a string and returns them as a list
def find_words(text):
    pattern = r"\b\w+\b"
    return re.findall(pattern, text)

print(find_words("This is a sample sentence."))  # ['This', 'is', 'a', 'sample', 'sentence']

# Exercise 3: Write a function that finds all two-digit numbers in a string and returns them as a list
def find_two_digit_numbers(text):
    pattern = r"\b\d{2}\b"
    return re.findall(pattern, text)

print(find_two_digit_numbers("12, 123, 45, 6, 789"))  # ['12', '45']

# Mini-project example: extract email addresses and phone numbers from text (e.g., the contents of a text file)
def extract_emails_and_phones(text):
    email_pattern = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
    phone_pattern = r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b"
    emails = re.findall(email_pattern, text)
    phones = re.findall(phone_pattern, text)
    return emails, phones

sample_text = """
Contact us at support@example.com or call 123-456-7890.
For sales, email sales@example.com or call 987.654.3210.
Visit our website at www.example.com.
"""

emails, phones = extract_emails_and_phones(sample_text)
print("Emails:", emails)  # ['support@example.com', 'sales@example.com']
print("Phones:", phones)  # ['123-456-7890', '987.654.3210']
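The mini project is described as working on a text file, while the function above takes a string. One way to bridge the two is sketched below. The function name extract_contacts_from_file, the file name contacts.txt, and the choice to drop duplicates are all assumptions for illustration, and a temporary file is used so the sketch runs on its own.

```python
import re
import tempfile
from pathlib import Path

# Same patterns as the mini-project example, precompiled for reuse
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")

def extract_contacts_from_file(path):
    """Read a UTF-8 text file and return (emails, phones) with duplicates removed."""
    text = Path(path).read_text(encoding="utf-8")
    # dict.fromkeys keeps the first occurrence of each item in original order
    emails = list(dict.fromkeys(EMAIL_RE.findall(text)))
    phones = list(dict.fromkeys(PHONE_RE.findall(text)))
    return emails, phones

# Demonstration with a temporary file (hypothetical sample content)
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "contacts.txt"
    sample_path.write_text(
        "Contact support@example.com or call 123-456-7890.\n"
        "Again: support@example.com, fax 987.654.3210.\n",
        encoding="utf-8",
    )
    file_emails, file_phones = extract_contacts_from_file(sample_path)

print(file_emails)  # ['support@example.com']
print(file_phones)  # ['123-456-7890', '987.654.3210']
```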

This course aims to build familiarity with regular expressions in Python, and each lecture combines theory with hands-on practice.

