Python TextBuffer Implementation

2025-08-21

Developer: I am currently working on several tasks related to processing Verilog files (e.g., simple pre-/post-processing). I am considering a Python class that can store raw Verilog source code. The Verilog source can be kept as a single string, split by line and stored as a List[str] (without newlines), or kept as a list that includes the newlines via splitlines(keepends=True). Mixing these approaches within the project can cause confusion, so I recommend unifying on one consistent structure in a single class that can be used everywhere. Begin with a concise checklist (3-7 bullets) of what you will do; keep items conceptual, not implementation-level. After each significant code or class design decision, briefly validate if the chosen approach aligns with the project's consistency and extensibility goals, and self-correct if needed.

Here's the game plan:

Define one canonical, internal representation for Verilog source and convert to/from other views.
Normalize and track newline style for faithful round-trips (LF/CRLF/mixed).
Expose consistent views (as_string(), as_lines(keepends=...)) instead of multiple storage forms.
Provide a small, extensible transform pipeline for pre/post-processing (with history).
Make edits immutable by default (return a new instance) with an opt-in in_place=True.
Add I/O helpers (from_*, to_file) with explicit encoding and deterministic newline policy.

Key design decisions (with quick self-checks)

Canonical storage = one normalized str (\n line endings).
- Why: single source of truth avoids drift between parallel representations; Python string ops are fast and familiar.
- Consistency/Extensibility check: callers always interact with one invariant; adding new views later won’t change storage. ✅
Preserve original newline style as metadata ('lf'|'crlf'|'mixed').
- Why: we normalize internally but can emit exactly what the file used before, unless caller overrides.
- Check: round-tripping is predictable across the project; future formatters can opt into a uniform style. ✅
Views, not storage: as_string(newline=...) and as_lines(keepends=False|True).
- Why: banishes ad-hoc lists/strings in the codebase; one class hands out consistent views.
- Check: API encourages a single pattern everywhere; avoids mixing representations in state. ✅
Edits are immutable by default, with in_place=True for performance-critical paths.
- Why: functional style reduces surprise; explicit mutability keeps hot paths fast.
- Check: consistent with safety + speed; cache invalidation is centralized. ✅
Lazy, cached line splits (both with and without keepends).
- Why: many tools need line access; caching avoids repeated splitlines.
- Check: cache invalidated on any text change; future features (e.g., line index) can reuse it. ✅
Transform pipeline: apply(transformer) where a transformer is a callable text -> text, plus history.
- Why: flexible slot for “pre/post-processing” like trimming, EOL fixing, simple macro munging, etc.
- Check: keeps responsibilities modular; easy to add/remove steps; provenance is recorded. ✅
I/O boundaries explicit: constructors for string/lines/file; to_file(encoding, newline); no auto-detection magic.
- Why: explicit beats implicit; fewer surprises across platforms and CI.
- Check: project-wide consistency; easy to thread encoding/newline policy from config. ✅
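To make the first decision concrete, here is a minimal sketch (plain built-ins only; the sample text is made up for illustration) of how the three storage forms from the question relate to one LF string, and where they can drift apart.

```python
# Sample LF-normalized source; the content is illustrative only.
text = "module top;\n  wire a;\nendmodule\n"

# View 1: lines without line endings.
lines = text.splitlines()

# View 2: lines with line endings kept.
lines_keepends = text.splitlines(keepends=True)

# The keepends view joins back to the exact original string.
roundtrip_keepends = "".join(lines_keepends)

# The plain view loses the trailing newline unless the caller re-adds it,
# which is the kind of drift a single canonical string avoids.
roundtrip_plain = "\n".join(lines)

print(roundtrip_keepends == text)  # True
print(roundtrip_plain == text)     # False: the final "\n" is gone
```

Keeping only `text` as state and deriving both lists on demand means there is nothing to keep in sync.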

The class

from __future__ import annotations
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Iterable, Literal, Optional, Protocol, Tuple
import hashlib
import time

NewlineStyle = Literal["lf", "crlf", "mixed"]
EmitNewline = Literal["original", "lf", "crlf"]

class TextTransformer(Protocol):
    """A simple, composable transformer: text -> text."""
    name: str
    def __call__(self, text: str) -> str: ...

@dataclass(frozen=True)
class TransformRecord:
    name: str
    before_sha1: str
    after_sha1: str
    timestamp: float = field(default_factory=time.time)

class VerilogSource:
    """
    Canonical container for Verilog source.

    Invariants:
      - Internal storage `_text` always uses LF ('\n') line endings.
      - `_newline_style` records the original newline style encountered on load.
      - All edit methods are immutable by default; pass in_place=True to mutate.
    """
    __slots__ = (
        "_text",
        "_newline_style",
        "_encoding",
        "_source_path",
        "_version",
        "_cached_lines",
        "_cached_lines_keepends",
        "_history",
    )

    def __init__(
        self,
        text: str,
        *,
        newline_style: Optional[NewlineStyle] = None,
        encoding: str = "utf-8",
        source_path: Optional[Path] = None,
        _assume_normalized: bool = False,
    ) -> None:
        if _assume_normalized:
            norm = text
            style = newline_style or "lf"
        else:
            norm, style = self._normalize_newlines(text, prefer="lf")  # internal LF
            if newline_style and newline_style != style:
                # an explicit newline_style from the caller overrides detection
                style = newline_style

        self._text: str = norm
        self._newline_style: NewlineStyle = style  # how it *came in* (for round-trip)
        self._encoding: str = encoding
        self._source_path: Optional[Path] = Path(source_path) if source_path else None

        self._version: int = 0
        self._cached_lines: Optional[list[str]] = None              # keepends=False
        self._cached_lines_keepends: Optional[list[str]] = None     # keepends=True
        self._history: list[TransformRecord] = []

    # ---------- creation / I/O ----------

    @classmethod
    def from_string(
        cls,
        text: str,
        *,
        encoding: str = "utf-8",
        source_path: Optional[Path] = None,
    ) -> "VerilogSource":
        return cls(text, encoding=encoding, source_path=source_path)

    @classmethod
    def from_lines(
        cls,
        lines: Iterable[str],
        *,
        keepends: bool = False,
        encoding: str = "utf-8",
        source_path: Optional[Path] = None,
    ) -> "VerilogSource":
        if keepends:
            joined = "".join(lines)
        else:
            joined = "\n".join(lines)
        return cls(joined, encoding=encoding, source_path=source_path)

    @classmethod
    def from_file(
        cls,
        path: Path | str,
        *,
        encoding: str = "utf-8",
    ) -> "VerilogSource":
        p = Path(path)
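        # Read bytes and decode here: a text-mode read translates CRLF to LF,
        # which would hide the original newline style from _normalize_newlines.
        text = p.read_bytes().decode(encoding)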
        return cls(text, encoding=encoding, source_path=p)

    def to_file(
        self,
        path: Path | str,
        *,
        encoding: Optional[str] = None,
        newline: EmitNewline = "original",
        ensure_trailing_newline: Optional[bool] = None,
    ) -> None:
        p = Path(path)
        s = self.as_string(newline=newline, ensure_trailing_newline=ensure_trailing_newline)
        # Write bytes so the platform's text mode cannot re-translate newlines.
        p.write_bytes(s.encode(encoding or self._encoding))

    # ---------- core views ----------

    def as_string(
        self,
        *,
        newline: EmitNewline = "original",
        ensure_trailing_newline: Optional[bool] = None,
    ) -> str:
        s = self._text
        if ensure_trailing_newline is True and (not s.endswith("\n")):
            s = s + "\n"
        if ensure_trailing_newline is False and s.endswith("\n"):
            s = s[:-1]

        if newline == "original":
            style = self._newline_style
        else:
            style = {"lf": "lf", "crlf": "crlf"}[newline]

        if style == "lf":
            return s
        elif style == "crlf":
            return s.replace("\n", "\r\n")
        else:  # "mixed": the original per-line endings cannot be reconstructed reliably
            # project policy: emit LF for mixed unless caller overrides
            return s

    def as_lines(self, *, keepends: bool = False) -> list[str]:
        if keepends:
            if self._cached_lines_keepends is None:
                self._cached_lines_keepends = self._text.splitlines(keepends=True)
            return list(self._cached_lines_keepends)
        else:
            if self._cached_lines is None:
                self._cached_lines = self._text.splitlines(keepends=False)
            return list(self._cached_lines)

    # ---------- inspection ----------

    @property
    def source_path(self) -> Optional[Path]:
        return self._source_path

    @property
    def newline_style(self) -> NewlineStyle:
        return self._newline_style

    @property
    def encoding(self) -> str:
        return self._encoding

    @property
    def version(self) -> int:
        return self._version

    @property
    def history(self) -> Tuple[TransformRecord, ...]:
        return tuple(self._history)

    @property
    def line_count(self) -> int:
        return len(self.as_lines())

    def get_line(self, idx: int, *, keepend: bool = False) -> str:
        lines = self.as_lines(keepends=keepend)
        return lines[idx]

    # ---------- transforms / edits ----------

    def apply(
        self,
        transformer: TextTransformer,
        *,
        in_place: bool = False,
        record: bool = True,
    ) -> "VerilogSource":
        before = self._text
        after = transformer(before)
        if after is before:
            # even if identity, return self or bump version in-place
            if in_place:
                self._bump_version()
            return self

        if in_place:
            self._set_text(after)
            if record:
                self._record(transformer.name, before, after)
            return self
        else:
            inst = VerilogSource(
                after,
                newline_style=self._newline_style,
                encoding=self._encoding,
                source_path=self._source_path,
            )
            if record:
                inst._history = [*self._history, self._mk_record(transformer.name, before, after)]
            return inst

    def replace(
        self,
        old: str,
        new: str,
        *,
        in_place: bool = False,
    ) -> "VerilogSource":
        class _Repl:
            name = "replace"
            def __call__(self, text: str) -> str:
                return text.replace(old, new)
        return self.apply(_Repl(), in_place=in_place)

    def set_line(
        self,
        idx: int,
        content: str,
        *,
        keepend: bool = False,
        in_place: bool = False,
    ) -> "VerilogSource":
        lines = self.as_lines(keepends=True)  # edit with keepends to keep layout
        if keepend:
            # caller provided EOL
            lines[idx] = content
        else:
            # normalize to internal LF
            eol = "\n" if lines[idx].endswith("\n") else ""
            lines[idx] = content + eol
        new_text = "".join(lines)
        return self._with_text(new_text, in_place=in_place)

    def insert_lines(
        self,
        idx: int,
        new_lines: Iterable[str],
        *,
        keepends: bool = False,
        in_place: bool = False,
    ) -> "VerilogSource":
        nl = [ln if keepends else (ln + "\n") for ln in new_lines]
        lines = self.as_lines(keepends=True)
        new_text = "".join(lines[:idx] + nl + lines[idx:])
        return self._with_text(new_text, in_place=in_place)

    def delete_lines(
        self,
        start: int,
        end: int,
        *,
        in_place: bool = False,
    ) -> "VerilogSource":
        lines = self.as_lines(keepends=True)
        new_text = "".join(lines[:start] + lines[end:])
        return self._with_text(new_text, in_place=in_place)

    def ensure_trailing_newline(self, *, in_place: bool = False) -> "VerilogSource":
        if self._text.endswith("\n"):
            return self
        return self._with_text(self._text + "\n", in_place=in_place)

    # ---------- internals ----------

    @staticmethod
    def _normalize_newlines(text: str, *, prefer: NewlineStyle = "lf") -> tuple[str, NewlineStyle]:
        # `prefer` is kept for call-site compatibility; storage is always LF.
        # "\r\n" also contains "\n", so count each ending kind separately.
        crlf = text.count("\r\n")
        lone_lf = text.count("\n") - crlf
        lone_cr = text.count("\r") - crlf
        if lone_cr or (crlf and lone_lf):
            style: NewlineStyle = "mixed"
        elif crlf:
            style = "crlf"
        else:
            style = "lf"
        norm = text.replace("\r\n", "\n").replace("\r", "\n")  # coerce to LF
        return norm, style

    def _with_text(self, new_text: str, *, in_place: bool) -> "VerilogSource":
        if in_place:
            self._set_text(new_text)
            return self
        inst = VerilogSource(
            new_text,
            newline_style=self._newline_style,
            encoding=self._encoding,
            source_path=self._source_path,
            _assume_normalized=True,
        )
        inst._history = list(self._history)
        return inst

    def _set_text(self, new_text: str) -> None:
        self._text = new_text
        self._invalidate_cache()
        self._bump_version()

    def _invalidate_cache(self) -> None:
        self._cached_lines = None
        self._cached_lines_keepends = None

    def _bump_version(self) -> None:
        self._version += 1

    def _mk_record(self, name: str, before: str, after: str) -> TransformRecord:
        return TransformRecord(
            name=name,
            before_sha1=hashlib.sha1(before.encode("utf-8")).hexdigest(),
            after_sha1=hashlib.sha1(after.encode("utf-8")).hexdigest(),
        )

    def _record(self, name: str, before: str, after: str) -> None:
        self._history.append(self._mk_record(name, before, after))

Validation: The class enforces one storage invariant (LF string), exposes consistent views, and gives predictable I/O. It’s easy to extend with more transforms or richer metadata later. This aligns with our consistency goal and makes future preprocessors trivial to add. ✅

Example transformers (plug-and-play)

class TrimTrailingWhitespace:
    name = "trim_trailing_ws"
    def __call__(self, text: str) -> str:
        # split("\n") (not splitlines()) keeps the final empty item, so a
        # trailing newline survives; only spaces/tabs at line ends are removed
        return "\n".join(line.rstrip(" \t") for line in text.split("\n"))

class EnsureFinalNewline:
    name = "ensure_final_newline"
    def __call__(self, text: str) -> str:
        return text if text.endswith("\n") else (text + "\n")

Validation: Transformers are plain callables; team members can add steps without touching the core class. History records provenance. ✅
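Since transformers are plain callables, several steps can be chained into one. The sketch below is only an illustration: `Compose` is a hypothetical helper that is not part of the class above, and the two step functions are simplified stand-ins for the example transformers.

```python
from typing import Callable


class Compose:
    """Hypothetical helper: runs several text -> text callables in order."""

    def __init__(self, *steps: Callable[[str], str], name: str = "compose") -> None:
        self.steps = steps
        self.name = name  # same `name` attribute the history records rely on

    def __call__(self, text: str) -> str:
        for step in self.steps:
            text = step(text)
        return text


def trim_trailing_ws(text: str) -> str:
    # split("\n") keeps a final empty item, so a trailing newline survives
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def ensure_final_newline(text: str) -> str:
    return text if text.endswith("\n") else text + "\n"


cleanup = Compose(trim_trailing_ws, ensure_final_newline, name="cleanup")
result = cleanup("wire a;  \nwire b;\t")
print(repr(result))  # 'wire a;\nwire b;\n'
```

A composed step carries its own name, so it would show up as a single history entry when passed to apply().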

Typical usage (uniform across the project)

# Load
src = VerilogSource.from_file("rtl/top.v", encoding="utf-8")

# Get views (without changing internal storage form)
text = src.as_string()                            # original newline style by default; pass newline="lf" for the internal form
lines = src.as_lines()                            # ['module top(', ...]
lines_ke = src.as_lines(keepends=True)            # ['module top(\n', ...]

# Edits (immutable by default)
src2 = src.set_line(0, "// AUTOGEN: do not edit")
src3 = src2.apply(TrimTrailingWhitespace())

# Save with original newline style, enforcing final newline
src3.to_file("rtl/top.v", newline="original", ensure_trailing_newline=True)

Validation: This usage pattern prevents any accidental mixing of storage forms. Every operation goes through one API, keeping the project consistent. ✅

Why this will hold up on bigger needs

Round-trip safety: We track original newline style and encoding; callers can enforce a project-wide style by always passing newline="lf" if desired.
Performance: Internal str, lazy line cache, optional in-place edits.
Extensibility: Add richer transforms (comment stripping, lightweight `\` line-continuation merges, macro stubs) without changing the core.
Determinism: No auto encoding detection; configs drive output style.
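One detail worth checking when relying on newline tracking: a text-mode read in Python applies universal-newline translation, so CRLF endings are already gone by the time the string reaches the caller. The sketch below (temporary file, hypothetical file name) contrasts a text-mode read with a byte-level read.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.v"  # hypothetical file name
    path.write_bytes(b"module top;\r\nendmodule\r\n")

    # Text mode translates "\r\n" to "\n" on read.
    as_text = path.read_text(encoding="utf-8")

    # Bytes are untouched, so the CRLF endings are still visible after decoding.
    as_bytes = path.read_bytes().decode("utf-8")

print("\r\n" in as_text)   # False
print("\r\n" in as_bytes)  # True
```

Any loader that wants to record the original style therefore has to see the raw bytes (or open the file with newline="").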

Optional next steps (if you want them)

Add a LineIndex helper for O(1) char-offset ↔ line/col mapping.
Provide diff(other) to aid review bots.
Introduce a config object (encoding/newline policy) to thread through I/O.

If you share any project constraints (e.g., must always emit LF, or enforce final newline), I can wire them into defaults so every team member gets the same behavior automatically.

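Looking at this, it doesn't seem like it needs to be limited to VerilogSource. Is there a more general-purpose approach you'd recommend?

Good call. Making it language-agnostic (text-agnostic) rather than Verilog-only is cleaner for the whole team. Let's generalize it as follows.

Checklist (conceptual)

Standardize on one general-purpose text container, with the internal representation fixed to a single rule (LF)
Add a profile layer (language/format metadata) so that only things like comment markers and extensions are swapped
Expose only a consistent view API (string / list of lines); there is exactly one storage form
Provide a transform pipeline (transformer: text -> text) with a recorded history
Immutable edits by default plus optional in-place edits, with cache/version management
Explicit I/O policy (encoding/newline rules) with a round-trip guarantee

Design steps (each with a quick validation)

General-purpose container = TextBuffer (stores a single internal LF string only)
- Why: with only one piece of state, mismatched representations cannot occur.
- Validation: project consistency (✅) / easy to extend to other languages (✅).
Separate metadata/profile (TextProfile)
- The name, typical extensions, comment markers, and so on are purely descriptive; behavior is identical.
- Validation: the text core stays the same and language-specific rules are layered on top → extensibility (✅).
View API = as_string(newline=...), as_lines(keepends=...)
- One storage form; only the views "present" it in different shapes.
- Validation: prevents mixed usage in the codebase (✅). Adding a new view leaves the internals unchanged (✅).
Newline policy and round-tripping
- On load, detect CRLF/mixed → normalize to LF internally and keep the "original style" as metadata.
- On save, choose original|lf|crlf; an ensure_trailing_newline option is provided.
- Validation: reproducibility/predictability (✅). Mixed endings cannot be fully restored → emit LF as a matter of policy (a reasonable self-correction).
Transform pipeline + history (sha1)
- Composable through apply(transformer); the name and hashes are recorded.
- Validation: a standard channel for step-by-step pre-/post-processing (✅). Easy to trace causes (✅).
Editing model: immutable by default + in_place=True
- Safe default; use the in-place path explicitly when performance requires it.
- Validation: balances safety and performance (✅). Cache invalidation is managed in one place (✅).
Explicit I/O boundaries
- Only from_*/to_file(encoding, newline) are affected by the environment.
- Validation: minimizes CI/OS differences (✅).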

General-purpose implementation skeleton (Python)

from __future__ import annotations
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Iterable, Literal, Optional, Protocol, Tuple
import hashlib, time

# ---- Profiles (descriptive language/format metadata) ----
@dataclass(frozen=True)
class TextProfile:
    name: str = "plain"
    file_exts: tuple[str, ...] = ()
    line_comment_markers: tuple[str, ...] = ()
    block_comment_markers: tuple[tuple[str, str], ...] = ()

# Example profiles
VERILOG = TextProfile(
    name="verilog",
    file_exts=(".v", ".sv"),
    line_comment_markers=("//",),
    block_comment_markers=(("/*","*/"),),
)
PYTHON = TextProfile(
    name="python",
    file_exts=(".py",),
    line_comment_markers=("#",),
)

NewlineStyle = Literal["lf", "crlf", "mixed"]
EmitNewline = Literal["original", "lf", "crlf"]

class TextTransformer(Protocol):
    name: str
    def __call__(self, text: str) -> str: ...

@dataclass(frozen=True)
class TransformRecord:
    name: str
    before_sha1: str
    after_sha1: str
    timestamp: float = field(default_factory=time.time)

class TextBuffer:
    """
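    General-purpose text container (language-agnostic).
    Internal rules:
      - _text is always a single string normalized to LF ('\\n')
      - the original newline style, encoding, path, and profile are kept as metadata
      - edits are immutable by default; pass in_place=True to mutate explicitly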
    """
    __slots__ = ("_text","_newline_style","_encoding","_source_path","_profile",
                 "_version","_cached_lines","_cached_lines_keepends","_history")

    def __init__(
        self,
        text: str,
        *,
        profile: TextProfile = TextProfile(),
        newline_style: Optional[NewlineStyle] = None,
        encoding: str = "utf-8",
        source_path: Optional[Path] = None,
        _assume_normalized: bool = False,
    ) -> None:
        if _assume_normalized:
            norm = text
            style = newline_style or "lf"
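        else:
            norm, style = self._normalize_newlines(text)
            if newline_style:
                # an explicit style (e.g. carried over by apply()) overrides detection
                style = newline_style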

        self._text: str = norm
        self._newline_style: NewlineStyle = style
        self._encoding: str = encoding
        self._source_path: Optional[Path] = Path(source_path) if source_path else None
        self._profile: TextProfile = profile

        self._version: int = 0
        self._cached_lines: Optional[list[str]] = None
        self._cached_lines_keepends: Optional[list[str]] = None
        self._history: list[TransformRecord] = []

    # ---------- constructors / I-O ----------
    @classmethod
    def from_string(cls, text: str, *, profile: TextProfile = TextProfile(),
                    encoding: str = "utf-8", source_path: Optional[Path] = None) -> "TextBuffer":
        return cls(text, profile=profile, encoding=encoding, source_path=source_path)

    @classmethod
    def from_lines(cls, lines: Iterable[str], *, keepends: bool = False,
                   profile: TextProfile = TextProfile(), encoding: str = "utf-8",
                   source_path: Optional[Path] = None) -> "TextBuffer":
        joined = "".join(lines) if keepends else "\n".join(lines)
        return cls(joined, profile=profile, encoding=encoding, source_path=source_path)

    @classmethod
    def from_file(cls, path: Path | str, *, profile: TextProfile = TextProfile(),
                  encoding: str = "utf-8") -> "TextBuffer":
        p = Path(path)
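        # Read bytes and decode here: a text-mode read translates CRLF to LF,
        # which would hide the original newline style from _normalize_newlines.
        text = p.read_bytes().decode(encoding)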
        return cls(text, profile=profile, encoding=encoding, source_path=p)

    def to_file(self, path: Path | str, *, encoding: Optional[str] = None,
                newline: EmitNewline = "original",
                ensure_trailing_newline: Optional[bool] = None) -> None:
        p = Path(path)
        s = self.as_string(newline=newline, ensure_trailing_newline=ensure_trailing_newline)
        # Write bytes so the platform's text mode cannot re-translate newlines.
        p.write_bytes(s.encode(encoding or self._encoding))

    # ---------- views ----------
    def as_string(self, *, newline: EmitNewline = "original",
                  ensure_trailing_newline: Optional[bool] = None) -> str:
        s = self._text
        if ensure_trailing_newline is True and not s.endswith("\n"):
            s += "\n"
        if ensure_trailing_newline is False and s.endswith("\n"):
            s = s[:-1]

        style = self._newline_style if newline == "original" else {"lf":"lf","crlf":"crlf"}[newline]
        if style == "lf":
            return s
        elif style == "crlf":
            return s.replace("\n", "\r\n")
        else:  # mixed cannot be restored exactly → emit LF by policy
            return s

    def as_lines(self, *, keepends: bool = False) -> list[str]:
        if keepends:
            if self._cached_lines_keepends is None:
                self._cached_lines_keepends = self._text.splitlines(keepends=True)
            return list(self._cached_lines_keepends)
        else:
            if self._cached_lines is None:
                self._cached_lines = self._text.splitlines(keepends=False)
            return list(self._cached_lines)

    # ---------- inspection ----------
    @property
    def profile(self) -> TextProfile: return self._profile
    @property
    def source_path(self) -> Optional[Path]: return self._source_path
    @property
    def newline_style(self) -> NewlineStyle: return self._newline_style
    @property
    def encoding(self) -> str: return self._encoding
    @property
    def version(self) -> int: return self._version
    @property
    def history(self) -> Tuple[TransformRecord, ...]: return tuple(self._history)
    @property
    def line_count(self) -> int: return len(self.as_lines())

    def get_line(self, idx: int, *, keepend: bool = False) -> str:
        return self.as_lines(keepends=keepend)[idx]

    # ---------- edits / transforms ----------
    def apply(self, transformer: TextTransformer, *, in_place: bool = False,
              record: bool = True) -> "TextBuffer":
        before = self._text
        after = transformer(before)
        if after is before:
            if in_place: self._bump_version()
            return self
        if in_place:
            self._set_text(after)
            if record: self._record(transformer.name, before, after)
            return self
        else:
            inst = TextBuffer(after, profile=self._profile, newline_style=self._newline_style,
                              encoding=self._encoding, source_path=self._source_path)
            if record: inst._history = [*self._history, self._mk_record(transformer.name, before, after)]
            return inst

    def replace(self, old: str, new: str, *, in_place: bool = False) -> "TextBuffer":
        class _Repl:
            name = "replace"
            def __call__(self, text: str) -> str: return text.replace(old, new)
        return self.apply(_Repl(), in_place=in_place)

    def set_line(self, idx: int, content: str, *, keepend: bool = False,
                 in_place: bool = False) -> "TextBuffer":
        lines = self.as_lines(keepends=True)
        if keepend:
            lines[idx] = content
        else:
            eol = "\n" if lines[idx].endswith("\n") else ""
            lines[idx] = content + eol
        return self._with_text("".join(lines), in_place=in_place)

    def insert_lines(self, idx: int, new_lines: Iterable[str], *, keepends: bool = False,
                     in_place: bool = False) -> "TextBuffer":
        nl = [ln if keepends else (ln + "\n") for ln in new_lines]
        lines = self.as_lines(keepends=True)
        return self._with_text("".join(lines[:idx] + nl + lines[idx:]), in_place=in_place)

    def delete_lines(self, start: int, end: int, *, in_place: bool = False) -> "TextBuffer":
        lines = self.as_lines(keepends=True)
        return self._with_text("".join(lines[:start] + lines[end:]), in_place=in_place)

    def ensure_trailing_newline(self, *, in_place: bool = False) -> "TextBuffer":
        return self if self._text.endswith("\n") else self._with_text(self._text + "\n", in_place=in_place)

    # ---------- internals ----------
    @staticmethod
    def _normalize_newlines(text: str) -> tuple[str, NewlineStyle]:
        # "\r\n" also contains "\n", so count each ending kind separately.
        crlf = text.count("\r\n")
        lone_lf = text.count("\n") - crlf
        lone_cr = text.count("\r") - crlf
        style: NewlineStyle
        if lone_cr or (crlf and lone_lf): style = "mixed"
        elif crlf: style = "crlf"
        else: style = "lf"
        return text.replace("\r\n","\n").replace("\r","\n"), style

    def _with_text(self, new_text: str, *, in_place: bool) -> "TextBuffer":
        if in_place:
            self._set_text(new_text); return self
        inst = TextBuffer(new_text, profile=self._profile, newline_style=self._newline_style,
                          encoding=self._encoding, source_path=self._source_path, _assume_normalized=True)
        inst._history = list(self._history)
        return inst

    def _set_text(self, new_text: str) -> None:
        self._text = new_text; self._invalidate_cache(); self._bump_version()

    def _invalidate_cache(self) -> None:
        self._cached_lines = None; self._cached_lines_keepends = None

    def _bump_version(self) -> None: self._version += 1

    def _mk_record(self, name: str, before: str, after: str) -> TransformRecord:
        return TransformRecord(
            name=name,
            before_sha1=hashlib.sha1(before.encode("utf-8")).hexdigest(),
            after_sha1=hashlib.sha1(after.encode("utf-8")).hexdigest(),
        )
    def _record(self, name: str, before: str, after: str) -> None:
        self._history.append(self._mk_record(name, before, after))

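Validation summary: one internal storage form (LF), a fixed view API, and standardized transforms/edits/history → in line with the consistency and extensibility goals (✅). The limit on restoring mixed newlines is stated as explicit policy to prevent confusion across the team (self-correction done).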
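One way the profile layer could be put to work is choosing a profile from a file's extension. The sketch below is an assumption rather than part of the skeleton: `profile_for_path` and `PLAIN` are hypothetical names, and `TextProfile` is a trimmed copy of the dataclass above so the block runs on its own.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TextProfile:  # trimmed copy of the profile dataclass above
    name: str = "plain"
    file_exts: tuple = ()


VERILOG = TextProfile(name="verilog", file_exts=(".v", ".sv"))
PYTHON = TextProfile(name="python", file_exts=(".py",))
PLAIN = TextProfile()  # hypothetical fallback profile


def profile_for_path(path, profiles=(VERILOG, PYTHON)) -> TextProfile:
    """Hypothetical lookup: first profile whose extensions match, else PLAIN."""
    suffix = Path(path).suffix.lower()
    for profile in profiles:
        if suffix in profile.file_exts:
            return profile
    return PLAIN


print(profile_for_path("rtl/top.v").name)     # verilog
print(profile_for_path("notes/README").name)  # plain
```

Because profiles are frozen dataclasses, the module-level instances can be shared freely between buffers.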

Usage example (language-agnostic)

# Same API for Verilog and Python
buf = TextBuffer.from_file("rtl/top.v", profile=VERILOG)

# Consistent views
text = buf.as_string()
lines = buf.as_lines()
lines_ke = buf.as_lines(keepends=True)

# Immutable edits + pipeline
class TrimTrailingWS:
    name = "trim_trailing_ws"
    def __call__(self, t: str) -> str:
        # split("\n") (not splitlines()) keeps a trailing newline intact
        return "\n".join(line.rstrip(" \t") for line in t.split("\n"))

buf2 = buf.set_line(0, "// AUTOGEN: do not edit")
buf3 = buf2.apply(TrimTrailingWS())
buf3.to_file("rtl/top.v", newline="original", ensure_trailing_newline=True)

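What if a Verilog-specific API is needed?

VerilogSource can simply wrap (or subclass) TextBuffer with profile=VERILOG as the default. The codebase always uses the TextBuffer API only, so mixed usage disappears.

Recommended policies (worth fixing as team rules)

Default to newline="original" on save (preserves legacy files) → a reformat step can optionally convert everything to lf in one pass.
Perform every transformation through a transformer only (no direct string manipulation).
In CI, apply ensure_trailing_newline=True + TrimTrailingWS.

The design leaves room to layer on utilities such as tokenization, line-offset mapping, or diff(other) if needed. If you tell me which default profiles you want (Verilog/Lex/YAML, etc.), I'll add them as templates as well.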

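Is there a well-known class or Python built-in that does this?

I'll think it through in this order (conceptual checklist):

The standard library has no single class that does "all of this" → the pieces have to be combined.
Identify the closest standard tools and make clear what is missing.
Cover the missing parts (newline tracking, consistent views, transform pipeline) with a thin wrapper.
Choose between a minimalist version (2-3 functions) and an extended one (a class) according to team policy.

Short conclusion

No standard or well-known built-in provides "text container + newline policy + transforms/history" in one place.
A thin wrapper like the TextBuffer proposed so far is therefore the most practical option in terms of consistency and round-trip guarantees.
Validation: in line with the team-wide consistency and extensibility goals. ✅

The closest standard tools (what they do and don't do)

io.StringIO
An in-memory file object. Line-by-line reading/writing is easy, but it does not track the original newline style and has no transforms/history.
Validation: falls short of the "consistent internal representation" requirement → a wrapper is needed. ❌
open(..., newline=...) / pathlib.Path.read_text()/write_text()
Text mode lets you control newline translation (e.g., forcing newline='\n' on write). With the default settings, however, a read does not preserve whether the original was CRLF or mixed (it is normally absorbed into \n); keeping the raw endings requires newline='' or reading bytes.
Validation: no metadata that preserves the "original newline style" for round-tripping → a wrapper is needed. ❌
fileinput
Iterates over multiple files as one stream of lines. Not a container; no state/views/history. ❌
difflib
Helps compute differences and print patches. Not a transform pipeline or storage, but useful as a supporting tool. 🆗
(tokenize, ast, parso, etc.)
Language-specific parsers/tokenizers. Good for "syntax-aware transforms" but not a general-purpose text container. Supporting tools. 🆗

Summary: the pieces exist, but the combination "consistent container + newline policy + transforms/history" has to be wrapped thinly by hand.

Minimalist alternative (three functions, no class)

If a class feels like too much, fix the team rules as functions.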

from pathlib import Path

def detect_newline_style(raw: bytes) -> str:  # "lf" | "crlf" | "mixed"
    crlf = raw.count(b"\r\n")
    lf = raw.count(b"\n")
    cr = raw.count(b"\r") - crlf
    if (crlf and (lf - crlf)) or cr > 0:  # any mixture of styles counts as mixed
        return "mixed"
    return "crlf" if crlf else "lf"

def normalize_to_lf(s: str) -> str:
    return s.replace("\r\n", "\n").replace("\r", "\n")

def emit_with_newline(s: str, style: str) -> str:
    # Only "crlf" is expanded; "lf" and "mixed" are emitted as LF by policy.
    return s.replace("\n", "\r\n") if style == "crlf" else s

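Usage pattern:

raw = Path(p).read_bytes()
style = detect_newline_style(raw)          # original-style metadata
text = normalize_to_lf(raw.decode('utf-8'))
lines = text.splitlines()                  # whichever view is needed
out = emit_with_newline(text, style)       # round-trip on save

Validation: satisfies "single internal representation (LF string) + preserved original newline metadata". There is still no transform pipeline/history, so promote to a class when that is needed. ✅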

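When the standard library is enough, and when a wrapper is needed

Enough: simple read/write plus formatters (e.g., pre-commit hooks that strip trailing spaces and enforce an EOF newline), when transform history and view consistency do not matter much.
Wrapper needed:
- Round-trip accuracy (keeping the original newline policy) matters
- You want to ban mixing "strings/lines" in the codebase and unify on one API
- You want to manage pre-/post-processing as a pipeline + history

Validation: measured against our requirements (consistency, extensibility, round-tripping), the wrapper is the right fit. ✅

Reference (ideas for combining third-party tools)

pre-commit / pre-commit-hooks: repository-level hygiene with end-of-file-fixer, trailing-whitespace, mixed-line-ending, and similar hooks.
Complementary role: pushes part of the pipeline into CI to strengthen consistency. ✅
difflib / unidiff: present transformation results as patches that are easy to review.
Complementary role: improves the team's review experience. ✅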
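As a small illustration of the review idea, the standard-library difflib can turn a before/after pair into a unified diff. The file labels and sample text below are made up for the example.

```python
import difflib

before = "wire a;  \nwire b;\n"   # trailing spaces on the first line
after = "wire a;\nwire b;\n"

# unified_diff works on lists of lines; keepends=True keeps the "\n" so the
# output lines can be joined directly.
diff = list(difflib.unified_diff(
    before.splitlines(keepends=True),
    after.splitlines(keepends=True),
    fromfile="top.v (before)",
    tofile="top.v (after)",
))
print("".join(diff), end="")
```

The resulting text can be attached to a review comment or CI log next to the transformer name that produced it.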

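Wrap-up

There is no well-known built-in/class that "does all of this".
The standard tools (especially open/newline and, on Python 3.10+, Path.write_text(newline=...)) are enough to enforce an output policy, but they do not cover original-newline tracking, consistent views, or history.
If you want team-level consistency and extensibility, the thin general-purpose wrapper designed here (or the three functions above + rules) is the safest choice.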

@classmethod def from_file(cls, path: Path | str, *, profile: TextProfile = TextProfile(), encoding: str = "utf-8") -> "TextBuffer": p = Path(path) text = p.read_text(encoding=encoding) return cls(text, profile=profile, encoding=encoding, source_path=p) What is the * in the middle here? I've never seen this syntax before.

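Short checklist:

* is the keyword-only argument separator.
Parameters that come after * must be passed in name=value form.
Purpose: better readability, fewer mistakes (mixed-up order), and API extensibility (new options can be added safely).
The bare * is different from *args (it does not collect anything; it is only a separator).

One-line summary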

def from_file(cls, path: Path | str, *, profile: TextProfile = TextProfile(),
              encoding: str = "utf-8") -> "TextBuffer":

Because of the * in the middle, profile and encoding can only be passed by keyword.

✅ Correct calls:

TextBuffer.from_file("rtl/top.v")  # uses the defaults
TextBuffer.from_file("rtl/top.v", profile=VERILOG)
TextBuffer.from_file("rtl/top.v", encoding="utf-16")

❌ Incorrect call (passed by position):

TextBuffer.from_file("rtl/top.v", VERILOG)
# TypeError: ... takes 2 positional arguments but 3 were given

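Why use it? (step-by-step reasoning)

The meaning becomes explicit.
With the name visible, as in profile=VERILOG, there is less room to confuse what an argument means.
→ In line with the project consistency goal. ✅
It is robust to parameter order changes.
Even if more options are added later, existing call sites use name=value and do not break.
→ In line with the API extensibility goal. ✅
It prevents mistakes.
It blocks the error of passing profile and encoding in swapped order. ✅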
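The points above can be checked with a tiny runnable sketch. `load` is a hypothetical function used only to show the signature behavior; it is not part of TextBuffer.

```python
def load(path, *, encoding="utf-8", strict=False):
    """Hypothetical loader; only the signature matters here."""
    return (path, encoding, strict)


# Keyword form: order-independent and self-describing.
ok = load("top.v", strict=True, encoding="utf-16")

try:
    load("top.v", "utf-16")  # a second positional argument is rejected
    error = None
except TypeError as exc:
    error = exc

print(ok)                    # ('top.v', 'utf-16', True)
print(type(error).__name__)  # TypeError
```

Adding another option after the * later would not affect either call site, since nothing after the separator depends on position.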

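Difference between `*` and `*args`

* (bare): a separator. It makes the arguments after it keyword-only.
*args: collects the remaining positional arguments into a tuple. (Acts as both collector and separator.)

Example: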

def f(a, *, b, c=0):         # b and c are keyword-only
    ...

def g(a, *args, b, **kwargs): # a is a regular parameter, args collects extra positionals, b is keyword-only
    ...

Bonus: there is also `/` (positional-only)

Since Python 3.8, / can be used to force positional-only arguments.

def h(a, /, b, *, c):  # a: positional-only, b: either, c: keyword-only
    ...

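Type hint notes

Path | str is the union shorthand available from Python 3.10 (earlier versions use Union[Path, str]).
Using TextProfile() as a default means a single instance is created at definition time and reused on every call.
Here that is safe because the class is @dataclass(frozen=True) and therefore immutable,
but in general also consider a None default and creating the object inside the function.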
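
Why the None pattern matters shows up as soon as the default is mutable. The following sketch uses made-up function names and a plain list in place of a profile object.

```python
def append_bad(item, bucket=[]):          # the list is created once, at def time
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):       # None sentinel: fresh list per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

bad = [append_bad(1), append_bad(2)]      # both entries are the same shared list
good = [append_good(1), append_good(2)]   # each call got its own list
```

A frozen dataclass instance cannot be modified this way, which is why sharing it as a default is harmless.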

If helpful, adopt a team rule such as "all optional arguments go after *": call sites are then always name=value, which keeps them tidy and pays off in long-term maintenance.

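What is __slots__?

Short checklist (concepts):

__slots__ is a mechanism that fixes the set of attribute names an instance can have.
Effects: lower memory use, slightly faster attribute access, and protection against typos and ad-hoc attributes.
Declaration: list the names in a tuple, list, or other iterable, as in class C: __slots__ = ('a', 'b').
Caveats: no dynamic attributes; inheritance, defaults, __dict__, and __weakref__ need care.
Recommended use: classes with a very large number of instances and a fixed field schema.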

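One-line definition

When a class defines __slots__, its instances do not get a per-instance __dict__; only the predeclared slots (fixed fields) are stored, compactly, in memory.

Why use it (step-by-step)

Memory savings: values live in fixed slots instead of a __dict__ (a hash map), saving tens to hundreds of bytes per instance (depending on field count and Python implementation). With many objects (for example, hundreds of thousands) the difference is large.
Cheaper attribute access: access goes through a fixed offset rather than a hash lookup, so it is slightly faster (small, but it adds up).
API safety: setting an undeclared attribute is an error, which prevents typos and ad-hoc fields.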
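
A rough way to observe the first point on your own interpreter is sketched below. The class names are made up, and sys.getsizeof only gives an approximate, CPython-specific footprint, so treat the byte counts as indicative rather than exact.

```python
import sys

class PlainPoint:
    def __init__(self, x, y):
        self.x, self.y = x, y

class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x, self.y = x, y

plain, slotted = PlainPoint(1, 2), SlottedPoint(1, 2)
has_dict = (hasattr(plain, "__dict__"), hasattr(slotted, "__dict__"))

# Approximate per-instance cost: the object itself plus its attribute dict.
plain_bytes = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_bytes = sys.getsizeof(slotted)
print(plain_bytes, slotted_bytes)
```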

How to use it

class Node:
    __slots__ = ('value', 'next')  # fixes the instance field schema

    def __init__(self, value, next=None):
        self.value = value   # OK
        self.next = next     # OK

n = Node(1)
n.value = 2      # OK
n.other = 3      # AttributeError: 'Node' object has no attribute 'other'

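Default values are usually assigned in __init__.

A class attribute cannot supply the default: a class attribute that shares its name with a slot is rejected when the class is created.

class C:
    __slots__ = ('a',)
    a = 10  # ValueError: 'a' in __slots__ conflicts with class variable
            # Assign the default in __init__ instead.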

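Common pitfalls (worth knowing)

1) What if you need `__dict__` / `__weakref__`?

By default, instances of a __slots__ class have no __dict__.

If you want to allow dynamic attributes as an exception, list "__dict__" in the slots:

class C:
    __slots__ = ('a', '__dict__')  # slots + dynamic attributes allowed

To use weakref, include "__weakref__" as well: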

class C:
    __slots__ = ('a', '__weakref__')
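
The effect of the two special slot names can be checked directly. The class names below are made up for the demonstration.

```python
import weakref

class NoWeak:
    __slots__ = ("a",)

class WithWeak:
    __slots__ = ("a", "__weakref__")

class WithDict:
    __slots__ = ("a", "__dict__")

try:
    weakref.ref(NoWeak())             # no __weakref__ slot: not weak-referenceable
    weakref_error = None
except TypeError as exc:
    weakref_error = str(exc)

target = WithWeak()
ref = weakref.ref(target)             # works because of the __weakref__ slot

flexible = WithDict()
flexible.extra = 1                    # stored in the per-instance __dict__
```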

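2) Inheritance

If a subclass of a slotted class does not define __slots__, the subclass gets a __dict__ again (dynamic attributes become possible).
→ To keep the restriction, the subclass must define __slots__ too.

Do not repeat the parent's slot names in the subclass. List only the newly added slots:

class Base:
    __slots__ = ('a',)

class Child(Base):
    __slots__ = ('b',)  # 'a' already lives in Base, so it is not repeated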
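
Both inheritance rules can be verified with a short sketch (Loose and Strict are illustrative names):

```python
class Base:
    __slots__ = ("a",)

class Loose(Base):              # no __slots__: instances get a __dict__ again
    pass

class Strict(Base):
    __slots__ = ("b",)          # only the new name; 'a' is inherited

loose = Loose()
loose.anything = 1              # allowed, stored in loose.__dict__

strict = Strict()
strict.a, strict.b = 1, 2       # inherited and new slots both work
try:
    strict.c = 3                # still not declared anywhere
    strict_rejects_c = False
except AttributeError:
    strict_rejects_c = True
```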

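3) Multiple inheritance

In CPython, inheriting from two or more bases that each define non-empty __slots__ fails at class creation with a TypeError (an instance layout conflict).
Practical tip: keep multiple inheritance to a minimum, or give slots to only one base and design the others as data-free mixins (with an empty __slots__ = ()).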
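
A small sketch of both the conflict and the mixin workaround, assuming CPython and using made-up class names:

```python
class A:
    __slots__ = ("a",)

class B:
    __slots__ = ("b",)

class Mixin:
    __slots__ = ()              # data-free mixin: adds behavior, no storage

    def describe(self) -> str:
        return type(self).__name__

try:
    class AB(A, B):             # two bases with non-empty slots
        __slots__ = ()
    conflict = None
except TypeError as exc:
    conflict = str(exc)

class AM(A, Mixin):             # one slotted base plus an empty-slot mixin
    __slots__ = ()

am = AM()
am.a = 1
```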

4) Combining with dataclass

@dataclass(slots=True) (Python 3.10+) generates __slots__ automatically.
If __slots__ is already defined by hand, do not also pass slots=True; the decorator raises TypeError in that case.

5) pickle / serialization

It works in many cases, but some frameworks expect __dict__.
If custom serialization is needed, implement __getstate__/__setstate__ and convert the slot values to a dict.
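
One possible shape for those two hooks is sketched below with a hypothetical Span class. The round trip uses copy.deepcopy, which goes through the same state hooks that pickle uses; pickle.dumps/pickle.loads should behave the same way as long as the class is importable by name.

```python
import copy

class Span:
    __slots__ = ("start", "end")

    def __init__(self, start, end):
        self.start, self.end = start, end

    def __getstate__(self):
        # Expose the slot values as a plain dict for serializers.
        return {name: getattr(self, name) for name in self.__slots__}

    def __setstate__(self, state):
        # Restore each slot from the dict produced by __getstate__.
        for name, value in state.items():
            setattr(self, name, value)

original = Span(3, 9)
state = original.__getstate__()
clone = copy.deepcopy(original)       # rebuilds the object via the state hooks
```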

6) Profiling/debugging tools

Some tools try to expand obj.__dict__ directly and may trip over its absence.
If you use such tools, add a "__dict__" slot or provide a helper method (for example, to_dict()).
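
A helper of that kind could look like the following sketch. The function name slots_to_dict is hypothetical; it walks the MRO because each class in a hierarchy lists only its own slots.

```python
def slots_to_dict(obj) -> dict:
    """Collect the slot values set on obj, across its whole class hierarchy."""
    result = {}
    for klass in type(obj).__mro__:
        slots = vars(klass).get("__slots__", ())
        if isinstance(slots, str):           # a bare string declares one slot
            slots = (slots,)
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue
            if hasattr(obj, name):           # unset slots raise AttributeError
                result[name] = getattr(obj, name)
    return result

class Base:
    __slots__ = ("a",)

class Child(Base):
    __slots__ = ("b",)

child = Child()
child.a = 1                                  # b is intentionally left unset
snapshot = slots_to_dict(child)
```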

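When to use it / when to avoid it

Good fit

There are a huge number of instances (e.g., parser tokens, AST nodes, log records).
The attribute schema does not change.
API stability and memory efficiency matter, as in a library or core layer.

Poor fit

Attributes need to be attached and removed dynamically.
Debugging/inspection tools depend heavily on __dict__.
The multiple-inheritance structure is complex (risk of conflicts).

Your codebase's perspective (quick self-check)

If, as with a text buffer or source container, the fields are fixed and many instances may be created, __slots__ fits the consistency/performance goals. ✅
However, if the transform pipeline or metadata is extended often and fields may be added dynamically, it is safer to keep flexibility with a regular class (or to include "__dict__" in __slots__). ✅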

Mini example: dataclass + slots

from dataclasses import dataclass

@dataclass(slots=True)  # Py3.10+
class Token:
    kind: str
    value: str
    line: int
    col: int

Memory savings, typo protection, and an explicit schema, all in one step.

If needed, I can go straight to proposing the field list and inheritance strategy for applying __slots__ to the TextBuffer/VerilogSource designed so far.

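How about having it detect the type and build the object automatically? Telling from_lines, from_string, and from_textbuffer apart is a bit of a hassle, so it would compare with something like isinstance and construct the right thing automatically (from_file stays separate), while the existing functions stay too.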

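Sounds good. Adding "automatic construction by type" is quite useful for convenience. It is worth fixing the rules clearly, though, so that predictability does not suffer.

Checklist (concepts)

A single entry point, coerce(...), that accepts strings, line lists, and TextBuffer alike
Keep the from_* methods as they are (use them where an explicit path is needed)
Fix the ambiguity rules: document the default keepends behavior and the detection criteria
Design so third-party types can be added through a protocol/adapter
Reject unsupported types with a clear error

Design conclusions & quick validation

Add TextBuffer.coerce(obj, ...)
- TextBuffer → returned as is
- str → from_string
- list/tuple/Iterable[str] → from_lines (materialized if needed)
- Validation: simpler call sites (✅), coexists with the existing explicit API (✅).
Default keepends policy
- keepends=None (default): if any line contains \n or \r, treat it as keepends=True; otherwise False.
- If keepends is given explicitly, that value is used as is.
- Validation: minimal heuristics and deterministic (✅). Avoids excessive "guessing" (self-correction).
Clear boundaries
- Path/bytes are not handled here. File I/O goes through from_file only.
- Validation: no I/O side effects, so consistency is preserved (✅).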
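
To see why the keepends distinction matters, here is a small standalone sketch using str.splitlines. The helpers join_lines and guess_keepends are illustrative free functions, not part of the class.

```python
text = "module top;\nendmodule\n"

without_ends = text.splitlines()               # line breaks stripped
with_ends = text.splitlines(keepends=True)     # line breaks kept on each item

def join_lines(lines, keepends: bool) -> str:
    """Rebuild one string; only the keepends=False form needs separators."""
    return "".join(lines) if keepends else "\n".join(lines)

def guess_keepends(lines) -> bool:
    """Same rule as the policy above: any embedded newline means keepends=True."""
    return any("\n" in ln or "\r" in ln for ln in lines)

rebuilt_with = join_lines(with_ends, guess_keepends(with_ends))
rebuilt_without = join_lines(without_ends, guess_keepends(without_ends))
```

Note that the keepends=False form cannot tell whether the original text ended with a newline, so the rebuilt string lacks the final line break.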

Code to add (methods to attach to the existing `TextBuffer`)

from collections.abc import Iterable, Sequence

class TextBuffer:
    # ... existing code ...

    @classmethod
    def coerce(
        cls,
        obj,
        *,
        profile: TextProfile = TextProfile(),
        encoding: str = "utf-8",
        keepends: bool | None = None,
    ) -> "TextBuffer":
        """
        Normalize various inputs (obj) into a TextBuffer.
        - TextBuffer: returned as is
        - str: from_string
        - Iterable[str]: from_lines (keepends is inferred when not given)
        - (Note) Path/bytes are not handled here → use from_file / a decode path
        """
        # 1) Already a TextBuffer
        if isinstance(obj, TextBuffer):
            return obj

        # 2) A string
        if isinstance(obj, str):
            return cls.from_string(obj, profile=profile, encoding=encoding)

        # 3) An iterable of strings (list/tuple/generator, etc.)
        if isinstance(obj, Iterable) and not isinstance(obj, (bytes, bytearray, str)):
            # Materialize (so a generator is consumed only once, and to type-check)
            lines = list(obj)

            if not all(isinstance(x, str) for x in lines):
                raise TypeError("Iterable must yield str values")

            # keepends rule: an explicit value wins; otherwise auto-detect
            ke = keepends if keepends is not None else cls._guess_keepends(lines)
            return cls.from_lines(lines, keepends=ke, profile=profile, encoding=encoding)

        # 4) Reject every other type explicitly
        raise TypeError(
            f"Cannot coerce type {type(obj).__name__} to TextBuffer. "
            "Use from_file() for paths/files, or pass str/Iterable[str]/TextBuffer."
        )

    @staticmethod
    def _guess_keepends(lines: Sequence[str]) -> bool:
        """Treat the input as keepends=True if any line contains a newline character."""
        for ln in lines:
            if ("\n" in ln) or ("\r" in ln):
                return True
        return False

Validation: the single entry point provides convenience, while side effects such as file I/O and byte handling stay in the explicit API → fits the consistency/extensibility goals. ✅

Usage examples

buf1 = TextBuffer.coerce("module top;\nendmodule\n", profile=VERILOG)
buf2 = TextBuffer.coerce(["module top;", "endmodule"], profile=VERILOG)              # keepends=False
buf3 = TextBuffer.coerce(["module top;\n", "endmodule\n"], profile=VERILOG)          # keepends=True (auto-detected)
buf4 = TextBuffer.coerce(existing_textbuffer)                                        # returned as is
# files go through the explicit API
buf5 = TextBuffer.from_file("rtl/top.v", profile=VERILOG)

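Extension ideas (if needed)

Provide a module-level ensure_textbuffer(obj, **opts) function via functools.singledispatch (calling the coerce above internally).
Third-party type support: allow an adapter pattern where, if an object implements to_textbuffer() (or __textbuffer__()), that method is called first.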
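
A rough sketch of how the two ideas could fit together follows. MiniBuffer, ensure_buffer, and Netlist are hypothetical stand-ins, and the sketch makes one simplifying choice that differs from the text: the __textbuffer__ hook is consulted only as the fallback for unregistered types rather than first.

```python
from functools import singledispatch

class MiniBuffer:                       # hypothetical stand-in for TextBuffer
    def __init__(self, text: str):
        self.text = text

@singledispatch
def ensure_buffer(obj) -> MiniBuffer:
    # Fallback for unregistered types: honor the adapter hook if present.
    hook = getattr(obj, "__textbuffer__", None)
    if callable(hook):
        return hook()
    raise TypeError(f"Cannot convert {type(obj).__name__} to MiniBuffer")

@ensure_buffer.register
def _(obj: MiniBuffer) -> MiniBuffer:
    return obj                          # already the right type

@ensure_buffer.register
def _(obj: str) -> MiniBuffer:
    return MiniBuffer(obj)

@ensure_buffer.register(list)
@ensure_buffer.register(tuple)
def _(obj) -> MiniBuffer:
    return MiniBuffer("\n".join(obj))   # assumes lines without line endings

class Netlist:                          # hypothetical third-party type
    def __textbuffer__(self) -> MiniBuffer:
        return MiniBuffer("module net;\nendmodule\n")
```

New types can then be supported either by registering another overload or by implementing the hook, without touching the dispatcher itself.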

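Final verdict

"Automatic construction" boosts productivity, and keeping the explicit API to mark the boundaries, as done here, is the safest arrangement.
Heuristics are kept to a minimum, and an explicit keepends always takes precedence when given.
→ This suits the project's goals of consistency (least surprise) and extensibility (room for more types). ✅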

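The word coerce is a bit hard, though.

Good point! If a name feels hard, people end up not using it 😅
How about changing it like this?

Checklist

Propose a few candidate names and compare their meaning and idiom
Pick one as the default recommendation
Keep the existing coerce as a compatibility alias
Show example code/usage
Quickly validate that the choice fits the consistency/extensibility goals

Name candidates (short comments)

as_textbuffer(obj, ...): the top recommendation. Like NumPy's asarray pattern, it reads intuitively as "give me this as a TextBuffer (unchanged if it already is one)".
ensure_textbuffer(obj, ...): carries the nuance "guarantee that this is a TextBuffer". Clear, but a little long.
from_any(obj, ...): "accept any type and construct". Functionally right, but the meaning is somewhat broad.
wrap(obj, ...) / adapt(obj, ...): suggests wrapping/adapting, but is less specific.
cast(...): risks confusion with type casting → not recommended.

Conclusion: as_textbuffer as the default, from_any layered on as the class method, and the existing coerce kept as an alias is the most readable and incremental arrangement.

Validation: renaming alone improves call-site readability (fits the consistency goal), and the meaning holds up as the API grows (fits the extensibility goal). ✅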

Code snippet (reusing the existing `coerce` body as the internal implementation)

# module-level function: the most visible entry point
def as_textbuffer(
    obj,
    *,
    profile: TextProfile = TextProfile(),
    encoding: str = "utf-8",
    keepends: bool | None = None,
) -> "TextBuffer":
    return TextBuffer.from_any(obj, profile=profile, encoding=encoding, keepends=keepends)

class TextBuffer:
    # ... existing code ...

    # New name: a class method whose meaning is clear
    @classmethod
    def from_any(
        cls,
        obj,
        *,
        profile: TextProfile = TextProfile(),
        encoding: str = "utf-8",
        keepends: bool | None = None,
    ) -> "TextBuffer":
        return cls._from_any_impl(obj, profile=profile, encoding=encoding, keepends=keepends)

    # The old name stays as a compatibility alias (deprecate later if desired)
    coerce = from_any

    @classmethod
    def _from_any_impl(
        cls,
        obj,
        *,
        profile: TextProfile,
        encoding: str,
        keepends: bool | None,
    ) -> "TextBuffer":
        from collections.abc import Iterable, Sequence

        if isinstance(obj, TextBuffer):
            return obj

        if isinstance(obj, str):
            return cls.from_string(obj, profile=profile, encoding=encoding)

        if isinstance(obj, Iterable) and not isinstance(obj, (bytes, bytearray, str)):
            lines = list(obj)
            if not all(isinstance(x, str) for x in lines):
                raise TypeError("Iterable must yield str values")
            ke = keepends if keepends is not None else cls._guess_keepends(lines)
            return cls.from_lines(lines, keepends=ke, profile=profile, encoding=encoding)

        raise TypeError(
            f"Cannot convert {type(obj).__name__} to TextBuffer. "
            "Use from_file() for paths/files, or pass str/Iterable[str]/TextBuffer."
        )

    @staticmethod
    def _guess_keepends(lines: "Sequence[str]") -> bool:
        return any(("\n" in ln) or ("\r" in ln) for ln in lines)

Usage examples

buf1 = as_textbuffer("module top;\nendmodule\n", profile=VERILOG)
buf2 = as_textbuffer(["module top;", "endmodule"], profile=VERILOG)
buf3 = TextBuffer.from_any(["a\n", "b\n"], profile=VERILOG)  # class-method path
buf4 = TextBuffer.coerce("...")  # existing code keeps working (compatibility alias)
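
If the alias is deprecated later, one common approach is to turn it into a thin wrapper that emits a DeprecationWarning. The sketch below uses a hypothetical MiniBuffer stand-in with a reduced from_any, and the warning text is only a suggestion.

```python
import warnings

class MiniBuffer:                         # hypothetical stand-in for TextBuffer
    def __init__(self, text: str):
        self.text = text

    @classmethod
    def from_any(cls, obj):
        if isinstance(obj, cls):
            return obj
        if isinstance(obj, str):
            return cls(obj)
        raise TypeError(f"Cannot convert {type(obj).__name__} to {cls.__name__}")

    @classmethod
    def coerce(cls, obj):
        """Old name: still works, but points callers to from_any()."""
        warnings.warn(
            "coerce() is deprecated; use from_any() or as_textbuffer()",
            DeprecationWarning,
            stacklevel=2,                 # attribute the warning to the caller
        )
        return cls.from_any(obj)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")       # make sure the warning is not filtered
    buf = MiniBuffer.coerce("module top;\nendmodule\n")
```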

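Quick validation & self-correction

Readability/predictability: the name as_textbuffer alone conveys the role. Little room for confusion. ✅
Consistency: explicit paths such as from_string/from_lines/from_file are kept, while the "convenience path" is unified under as_textbuffer. ✅
Extensibility: as third-party types grow, add an adapter hook to _from_any_impl (e.g., support for obj.__textbuffer__()). ✅
Ambiguity management: files/bytes are still handled only through the from_file/decode paths → minimal side effects and guessing logic. ✅

If you like, I can also write up a project-wide naming rule (e.g., "convenience conversion entry points always start with as_*").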
