λ³Έλ¬Έ λ°”λ‘œκ°€κΈ°

University of California, Berkeley/ElectricalEngineering & ComputerSciences

CS Theory

What is CS Theory ?

Up until now, we’ve talked a lot about some different tools for approaching problems, but there are still a lot of questions left unanswered

- What types of problems can we solve with computing?

- What’s the most efficient possible solution?

- How can we integrate other disciplines into computing?

- What are other ways we can perform computation?

CS theory involves thinking about these limitations without having any particular language in mind

Aspects of CS Theory

Optimization

Logic

Number Theory

Graph Theory

Type Theory

Computational Biology, Geometry, etc.

Cryptography

Quantum Theory

And today : Complexity and Computability

Disclaimer

- A large part of this lecture consists of abstractions & oversimplifications 

- Don’t take this as being too representative of the actual work done in this area

- However, it’s enough to give a sense of this topic, and if it sounds interesting, you should look into some classes in this area!

Complexity

Common Growth Patterns

1. Constant Growth

- # operations the same regardless of input size

- Grows like: k

2. Linear Growth

- Increasing input by 1 adds a constant amount to # of operations

- Grows like: ax + b

3. Logarithmic Growth

- # operations grows only when input is multiplied (doubled, tripled, etc.)

- Grows like: logb(x)

4. Exponential Growth

- Increasing input by 1 doubles (or triples, quadruples, etc.) # operations

- Grows like: kx (2x, 3x, etc.)

P vs NP

NP : "nondeterministic polynomial time", types of problems that when given a "solution" can be verified if the solution is correct quickly

P : "polynomial time", types of problems that we can create solutions to quickly

Eulerian Cycle

Problem : try to find a loop that crosses every bridge once and ends where you start

Is this problem NP ?

Step 1 : check if it's even a valid cycle

 Ex. can only cross bridges in the same area and ends where it starts

Step 2 : check that no bridges were left out

How long do these steps take?

 Step 1 : at most the number of bridges, path can't be longer than that 

 Step 2 : also at most number of bridges

- For it to be possible to have an Eulerian cycle it must be true that there aren't any islands with their own bridges that have no bridges connecting them to mainland.

Is this problem P ?

- For it to be possible to have a Eulerian cycle, you must be able to both go into a vertex and leave it (otherwise you'll get stuck)

- Or more precisely, you can't have an odd number of bridges otherwise you run out and either get stuck inside the land area or stuck outside it.

With these two conditions, you can guarantee it is possible to make an Eulerian cycle!

But how long does it take the check this conditions?

1. No floating islands with bridges

 : You have to walk across at most all the bridges to see you visited every area

2. Even number of bridges

 : You have to check every piece of land to make sure it has and even number of bridges

Together this is a process that take grows linearly in number of bridges and pieces of land!

Therefore, checking if you can make an Eulerian cycle is also in P

Hamiltonian Path

Hamiltonian Path: A path that goes through every piece of land once

Is this problem NP ?

If I have a "solution", to check if it's true :

1. Need to check it is a valid path

2. As well as it visits all the land areas

- We can try iterating over all the possible paths and checking if any one of them is a Hamiltonian path

- Problem is the number of possible paths is n! in the number of edges! 

- This strategy will take a looong time on a large setup

Q Can we do better?

A Not much better, the best known algorithm runs in exponential time!

Reductions

: We say problem A reduces to problem B if you can easily solve problem A using problem B.

NP Complete

There are a group of problems in NP which all NP problems reduce to, we call this group NP complete

Some examples of problems in this group are: 

- Hamiltonian cycle

- 3-Sat

- Travelling salesman

- Knapsack

- Subset sum ( otherwise known as count change )

Example : 3 - SAT

Given a boolean formula is it possible to set the variables such that the formula evaluates to True

Example : (a or b or not c) and (not a or not b or c)

Solution : Set a to true, b to false, c to true

Implications

- Because all problems in NP reduce to NP-complete problems, if an efficient solution to an NP-complete problem is found, all problems in NP subsequently will have an efficient solution.

- Here is an important example in NP : Public key cryptography

- If a reduction is found from an NP-complete problem to a problem in P then this will demonstrate P = NP

A side effect : This means public key cryptography will be easily solvable and a lot of computer security will be broken

A positive effect of P = NP would be a lot of currently difficult problem would be made much easier to solve

Example : Optimally scheduling buses, Can generate a mathematical proof for anything provable

This seems like it should be easy

- Turns out it is incredibly difficult to show P = NP or even P =/= NP, so difficult there is a prize.

- The prize is called the Millennium prize and has a bounty of 1 million dollars for each of the millennium problems.

- P vs NP is so powerful in fact, if you prove P = NP then you can also solve all of the other millennium problem (with 6 million dollar in total)

Summary

- We can describe problems by how long it takes to get a solution, and how long it takes to verify a solution

- Problems in NP can be verified in polynomial time (fast)

- Problems in P can be solved in polynomial time (fast)

- If P = NP, then every problem that can be easily verified can be easily solved

- There are a class of problems (NP-complete) which every problem in NP reduces to - so every problem in NP is at least as hard as these problems

Computability

Analyzing Program Behavior

: We’ve talked about ways to analyze program performance

Q What are some ways this can fall apart?

A Classic Proof ( Turing, 1936 )

Q Can we check to see if a program will always terminate?

Let’s imagine we can, and then see if we run into any issues

This approach is called a proof by contradiction

Let’s say we have a function, halt, which takes in a function f and value x, and returns True if f will terminate when called on x

https://www.cs.virginia.edu/~robins/Turing_Paper_1936.pdf

Contradictions

Implications

- The halting problem is undecidable, which means it is impossible for a computer to solve

- If we can reduce the halting problem to another problem, it must also be undecidable.

Example: suppose there exists a function computes_same which returns whether two functions, f1 and f2, return the same value for every possible input.

Reductions

Running A Line of Code

Applications To Security

- We can’t tell if a program will ever run a specific line of code, whether beneficial or harmful. 

- This means we can never make a “perfect antivirus” that can always tell if a program will run malicious code.

Q What can we do?

Signatures

- Even if we can’t tell if a program will exhibit some behavior, we can still see if it contains some line of code

- We look for patterns in the program which correspond to malicious code - these are called signatures

Problems:

1. There are many ways to write code with the same behavior

2. Sometimes harmless programs will have malicious-looking code

3. Viruses can keep changing their code to hide signatures - we call this polymorphism

https://ditam.github.io/posts/fizzbuzz/

 

https://ditam.github.io/posts/fizzbuzz/

Twenty Ways to FizzBuzz An exercise in Javascript Table of Contents It is one of the most famous challenges in programming, a common question at interviews, a display of minimal programming competency. Yet, FizzBuzz is a challenge that most programmers app

ditam.github.io

Getting Around the Halting Problem ?

: There are clearly cases where we can figure out if a program will terminate - even for an arbitrary input

Termination Analysis

- The Halting Problem: Given a program p, and an input x, return True if the program halts and False if it never terminates

- Termination analysis: Given a program p and an input x, return True if you know it halts, False if you know it never terminates, or “I don’t know” otherwise

- Goal : Minimize the number of times you have to say “I don’t know”

Another Classic Proof ( Turing 1949 )

https://fi.ort.edu.uy/innovaportal/file/20124/1/09-turing_checking_a_large_routine_earlyproof.pdf

Turing's Method

1. Generate a function, f, as in the previous slide

2. See if that function keeps decreasing and hits some “minimal value”

3. If yes, the program terminates. If no, generate another function and try that

4. At some point, either don’t know or have enough proof that you can say the function doesn’t halt

Transition Invariants

Idea : take multiple smaller functions and “or” them together

- Easier to find these smaller functions, but harder to prove it’s correct

- Most of the research now is how to speed up the proof of correctness

- This has been used to successfully debug device drivers

Size - Change Termination

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.646.9176&rep=rep1&type=pdf

Summary ( Computability )

- The halting problem is an example of an uncomputable problem: one that we can’t decide with computers

- This means we can’t decide if an arbitrary program will halt for an arbitrary input

- There are techniques which allow us to determine if certain programs will halt (even on an arbitrary input), but these won’t work for everything

- Other uncomputable problems reduce from the halting problem: such as creating a “true” antivirus or telling if two programs give identical outputs

 

'University of California, Berkeley > ElectricalEngineering & ComputerSciences' μΉ΄ν…Œκ³ λ¦¬μ˜ λ‹€λ₯Έ κΈ€

Conclusion  (0) 2019.08.17
Final Review  (0) 2019.08.13
Week 8  (0) 2019.08.13
Computer Security  (0) 2019.08.09
SQL : The Sequel  (0) 2019.08.08