Data types(2) - record, tuple, list, union, pointer, type checking

Notice

Recent Posts

Recent Comments

Link

Tags more

Archives

관리 메뉴

수정입니다

Data types(2) - record, tuple, list, union, pointer, type checking 본문

전공/프로그래밍언어론

Data types(2) - record, tuple, list, union, pointer, type checking

nongdamgom 2023. 12. 28. 12:45

Record Types

heterogeneous aggregate of data elements in which the individual elements are identified by names
동일한 타입이나 크기가 아닌 collection of data를 모델링
ex) C, C++, C# : struct
Design issues

--> What is the syntactic form of refernces to the field?

--> Are elliptical references allowed (생략 허용 or not)

Definition of Records in COBOL

01 EMPLOYEE-RECORD. 02 EMPLOYEE-NAME. 05 FIRST PICTURE IS X(20). 05 Middle PICTURE IS X(10). 05 LAST PICTURE IS X(20). 02 HOURLY-RATE PICTURE IS 99V99.

level numbers 를 이용하여 nested record 표현
others use recursive definition

References to Records

COBOL : MIDDLE OF EMPLOY-NAME OF EMPLOYEE-RECORD
Others : Employee_Record.Employee_Name.Middle (dot notation)

* Fully qualified references : 중첩의 제일 밖 record부터 중간 모든 record를 전부 참조(위 예시들)

* elliptical references : MIDDLE, MIDDLE OF EMPLOY-NAME, MIDDLE OF EMPLOYEE-RECORD

Evaluation and Comparison to Arrays

Array

--> 모든 데이터 값이 동일한 타입/ 동일한 방식으로 처리될 때 사용

--> 구조에 순차적으로 접근하는 체계적인 방법이 존재할 때 용이

--> subscripts 가 dynamic이라 record보다 느림

record

--> 데이터 값들이 다 다르고, 각 field가 동일한 방식일 필요 x

--> record field들은 특정 순서로 처리 될 필요 x

--> record는 static subscripts라서 array보다 빠름 (dynamic도 되긴하는데 type checking을 허용x, 더 느림)

Implementation of Record Type

struct Student{ int studentID; // offset 0 char name[50]; // offest 4 float gpa; // offset 54 }

record의 field들은 인접한 메모리 주소에 위치함
field들의 사이즈는 다 다를 수 있음
offset address는 각 field들의 시작 주소와 relative 함 (절대적 x)

Tuple Types

element의 이름이 record의 field 처럼 명명되지 않는다는 점을 빼면 record랑 비슷
여러개의 값 반환 허용하기 위해서 python, ML, F# 에서 사용
ex) Python

--> 파이썬 list와 비슷, 하지만 tuple은 immutable하다

--> tuple과 list는 서로 변경불가/허용에 따라 자유롭게 타입 바꿔쓸 수 있음

myTuple = (3, 5.8, 'apple') test = (1,2,3) del test[2] // test = (1,2) print(myTuple + test) // (3, 5.8, 'apple', 1, 2)

--> catenation, del 가능

ex) ML

val myTuple = (3, 5.8, 'apple'); #1(myTuple) // first element 참조 type intReal = int * real; // int 1, real 1로 구성된 새로운 tuple타입 정의 가능

--> * : just a separator

let tup = (3, 5, 7) let a, b, c = tup //a = 3, b = 5, c = 7

List Types

함수형 언어에서 먼저 시작
Lisp , Scheme

(A B C D) (A (B C) D)

--> 소괄호로 구분됨. comma 사용 x

--> (A B C) 가 code로서 해석되면 , parameter B,C를 가진 function A라는 뜻이 됨

--> list로서 사용하려면, '(A B C) 이렇게 앞에 ' 붙여줘야함

ex) python

--> Scehme, Common Lisp, ML, F#과 다르게 python의 list는 mutable 함

myList = [3, 5.8, "grape"] // any type can be allowed x = myList[1] // x = 5.8 del myList[1] // remove 5.8

--> list comprehension 가능

[x * x for x in range(7) if x % 3 == 0] // range(7) -> [0,1,2,3,4,5,6] // constructed list : [0,9,36]

Union Type

프로그램 실행 중, 다른 시기에 다른 타입의 값을 저장할 수 있는 타입
Design issue

--> should type checking be required?

difference with struct

Discriminated vs Free Unions

free union : C, C++

--> no language support for type checking

discriminated union : ML, Haskell, F#

--> dicriminant를 이용해 type checking

Evaluation of Union

--> free union are unsafe

--> Java , C# do not support union

Pointer Type

has a range of values that consists of memory addrsses and a special value, null
indirect addressing
heap 에 접근하는데 사용 (dynamic)

Design Issues of Pointers

What are the scope of and lifetime of a pointer variable?
What is the lifetime of a heap-dynamic variable?
Are pointers restricted as to the type of value to which they can point?
Are pointers used for dynamic storage management, indirect addressing, or both?
Should the language support pointer types, reference types, or both?

Pointer operations

assignment, dereferencing(역참조)
assignment : set a pointer variable's value to some useful address
dereferencing : implicit or explicit

--> implicit : int arr[5] (it means int *arr = &arr[5])

--> explicit : (int *ptr = &i 일 때) j = *ptr // *ptr이 가리키는 값 역참조

Problems with Pointers

Dangling pointers(dangerous)

--> pointer가 이미 dealloc 된 heap - dynamic varible을 가리키고 있음

Lost heap - dynamic variable

--> pointer p1이 heap dynamic var A를 가리키다가 B를 가리키면 A는 이제 접근할 수 x(memory leakage)

Pointers in C and C++

위 문제들에 매우 유연함 (해결책 딱히 x)
pointers can point at any variable(할당 되어있든 아니든 걍 다 가리킬 수 있음)
pointer 끼리 산술연산 가능
explicit dereferencing and address-of operators
domain type need not be fixed(void *)

--> void * : 어느 타입이나 가리킬 수 있는 포괄형 포인터(역참조 허용 x라 타입 검사 문제 없음)

Pointer Arithmetic in C and C++

int x[10] = {0,1,2,3,4,5,6,7,8,9}; int *p; p = x; // p = &x[0], *p = x[0] cout << *(p+5); // *(p+5) = x[5]

Reference Types

C++ : special kind of pointer type
Java : 주소 참조x, object 참조 (class instance)
In C++

--> pointer can have null (int *ptr;)

--> reference can not have null (int &ref;) - alias니까 당연함..

Evaluation of pointers, reference

dangling pointers & dangling object are problem as is heap management
pointer는 goto랑 비교되기도함
pointer & reference => dynamic data structure에는 필수적

Heap Management

very complex run-time process
Two approaches to reclaim garbage
Reference counters(eager approach)

--> reclamation is gradual

--> 접근 불가 cell이 생성 될 때마다 점진적으로 회수

--> counter를 계속 유지 시키면서 pointer가 어떤 cell을 현재 가리키고 있는지/아닌지 개수 체크

Object* obj1 = new Object(); // RefCount(obj1) start at 1 Object* obj2 = obj1; // RefCount(obj1) incremented to 2 as new reference is add Object* obj3 = new Object(); obj2 = NULL; // RefCount(obj1) decremented to 1 as ref goes away obj1 = obj3; // RefCount(obj1) decremented to 0 and can be collected

--> counter가 0이 되면 그 cell이 다시 가용공간 리스트에 반환

--> 장점 : 점진적으로 증가시키는거라 application이 실행될 때 상당한 지연을 피할 수 있음

--> 단점 : counter 에 대한 공간, 시간이 요구되고, cell들이 순환적으로 연결 돼 있으면 복잡해짐

Mark - sweep(lazy approach)

--> 사용가능한 기억공간의 리스트가 소진될 때만 회수

--> collection algorithm 에 의해 indicator에 extra 1 bit를 사용

--> 모든 cell을 일단 0으로 setting 해 놓고

--> pointer가 heap에서 reachable 하면 1로 setting (not ggarbage)

--> garbage cell은 available cell로 반환된다.

Dashed lines show the order of node_marking

--> 단점 : 처음버전은 이걸 드물게 사용해서 문제가 됨. (거의 모든 가용 공간을 다 쓸 때 하니까) 대부분의 cell들을 추적하고 표시해야 하기 때문에 상당한 시간이 소요된다.

=> 이걸 피하기 위해 incremental mark-sweep을 사용

--> 메모리가 소진되기 훨씬 이전에 더 빈번히 수행함.

Type Checking

operand operaotr operand( a + b )
the activity of ensuring that the perands of an operator are of compatible types
compatible type : 딱 봐도 되는거랑, 컴파일러에 의해 implicit하게 변환되는게 허용되는 타입들

--> 이 automatic conversion : coercion

type error : operator랑 operand랑 적합하지 않은 타입일 때 발생
type bindig 이 static이면 type checking도 static
type bindig 이 dynamic이면 type checking도 dynamic
strong type : type error가 항상 detect 되는 언어

--> 장점 : type error로 인한 오용을 감지할 수 있음

Strong Typing

C, C++ : strong type아님

--> union 은 type checking이 되지 않는다

Java, C# : 거의 strong type (explicit type casting 때문에 error 발생할 수 있지만, type error를 detect하지 못하는 implicit한 방법은 x)
ML, F# : strong type
Coercion rule은 strong typing을 약화 시킴

int a, b; float d; a + d // int가 float으로 강제 변환 돼서 오류 탐지 x (원래 a + b 쓰려 했는데 실수했다는 가정)

Java는 C++의 절반정도만 coercion rule을 허용하지만, 이게 strong type을 약화 시켜서 강제 변환이 거의 없는 Ada, ML, F# 같은 언어보다는 오류탐지에 효과가 떨어짐

Name and Structure Type Equivalence

Name type equivalence

--> 두 변수가 동일한 선언문 or 동일한 타입 이름을 갖는 선언문에 정의되면 이거임

--> 구현이 쉽지만 제약이 많다

Sturcture type equivalence

--> 두 변수의 타입이 동일한 구조를 가질 때

--> more flexible, 구현이 어렵다

Type Equivalence

Are two record types equivalent if they are structurally the same but use different field names?

--> Yes

Are two array types equivalent if they are the same except that the subscripts are different?

--> Yes (인덱싱이 0부터 9까지냐, 1부터 10까지냐 정도의 차이 )

Are two enumeration types equivalent if their components are spelled differently?

--> Yes

structural type이 동등하다면, 이 동일한 구조를 갖는 타입을 구별하는 것을 허용하지 않는다

--> 위 같은 작은 차이들을 무시하기 때문에 structural type equivalence은 더 제한적인 상황에서 사용됨

Theory and Data Types

Type theory is a broad area of study in mathematics, logic computer science, philosophy
In computer science

--> Parctical : commercial(상업적) 언어의 데이터 타입과 연관

--> Abstract : lambda 계산에 중점

a type system is a set of types and the rules that govern their use in programs

Summary

언어의 데이터 타입은 그 언어의 스타일과 유용성을 결정하는 데 있어 많은 부분을 차지함
명령형 언어의 primitive data type은 대부분 numeric, character, boolean
user-defined enumeration & subrange type은 편리하고 readability 랑 reliability를 높여줌
array 랑 record는 대부분의 언어가 포함하고 있음
pointer는 addressing의 flexibility와 dynamic storage management를 control하기 위해 사용함

// end of chapter 6

'전공 > 프로그래밍언어론' 카테고리의 다른 글

Statement - Level Control Structures (1)	2023.12.28
Expressions and Assignment Statements (1)	2023.12.28
Data Types(1) - numeric, string, array (1)	2023.12.28
Name, Bindings, and Scope(2) - Scope (0)	2023.12.28
Name, Bindings, and Scope(1) - Name, Bindings (0)	2023.12.28

'전공/프로그래밍언어론' Related Articles

수정입니다

Data types(2) - record, tuple, list, union, pointer, type checking 본문

Data types(2) - record, tuple, list, union, pointer, type checking

'전공 > 프로그래밍언어론' 카테고리의 다른 글

티스토리툴바