找到你要的答案

Q:Capturing rules of graph using types in Scala, OCaml and Haskell

Q:捕获在斯卡拉使用类型图规则、Ocaml和Haskell

I am trying to describe a complex graph with many different types of nodes and edges which can only be connected to each other according to a set of rules. I would like these rules to be checked at compile time using the type system of the language. There are many different node and edge types in my real application.

I have easily created a simple example in Scala:

sealed trait Node {
  val name: String
}
case class NodeType1(override val name: String) extends Node
case class NodeType2(override val name: String) extends Node
case class NodeType3(override val name: String) extends Node

sealed trait Edge
case class EdgeType1(source: NodeType1, target: NodeType2) extends Edge
case class EdgeType2(source: NodeType2, target: NodeType1) extends Edge

object Edge {
  def edgeSource(edge: Edge): Node = edge match {
    case EdgeType1(src, _) => src
    case EdgeType2(src, _) => src
  }
}

object Main {
  def main(args: Array[String]) {
    val n1 = NodeType1("Node1")
    val n2 = NodeType2("Node2")
    val edge = EdgeType1(n1, n2)
    val source = Edge.edgeSource(edge)
    println(source == n1)  // true
  }
}

A valid graph can only connect a given edge type between the given types of nodes as shown in the Scala example above. The function "edgeSource" extracts the source node from the edge, as simple as that.

Here comes a non working example of what I would like to write in OCaml:

type node =
    NodeType1 of string
  | NodeType2 of string

type edge =
    EdgeType1 of NodeType1 * NodeType2
  | EdgeType2 of NodeType2 * NodeType1

let link_source (e : edge) : node =
  match e with
  | EdgeType1 (src, _) -> src
  | EdgeType2 (src, _) -> src

The problem here is that "NodeTypeX" are constructors and not types. Hence, I am unable to use them when I describe the tuples with source and target for which the edges are defined. The "link_source" function can only return one type and "node" is the variant which can return something.

I have been trying out how to fix this in both OCaml and Haskell and here is an example of one go in OCaml where the node type wraps node_type_X:

type node_type_1 = NodeType1 of string
type node_type_2 = NodeType2 of string

type node =
    NodeType1Node of node_type_1
  | NodeType2Node of node_type_2

type edge =
    EdgeType1 of node_type_1 * node_type_2
  | EdgeType2 of node_type_2 * node_type_1

let link_source (e : edge) : node =
  match e with
  | EdgeType1 (src, _) -> NodeType1Node src
  | EdgeType2 (src, _) -> NodeType2Node src

But the problem with this is that I am duplicating the type information. I am specifying the source node type in the definition of edge, and it is also given when matching the edge in link_source as NodeTypeXNode.

Obviously I am not understanding how to solve this problem. I am stuck thinking in class hierarchies. What would be the correct way to express what I am achieving in the Scala code above in OCaml or Haskell?

我试图描述一个复杂的图形与许多不同类型的节点和边缘,只能相互连接,根据一组规则。我想使用编译语言的类型系统在编译时检查这些规则。在我的实际应用中有许多不同的节点和边缘类型。

我很容易地创建一个简单的例子斯卡拉:

sealed trait Node {
  val name: String
}
case class NodeType1(override val name: String) extends Node
case class NodeType2(override val name: String) extends Node
case class NodeType3(override val name: String) extends Node

sealed trait Edge
case class EdgeType1(source: NodeType1, target: NodeType2) extends Edge
case class EdgeType2(source: NodeType2, target: NodeType1) extends Edge

object Edge {
  def edgeSource(edge: Edge): Node = edge match {
    case EdgeType1(src, _) => src
    case EdgeType2(src, _) => src
  }
}

object Main {
  def main(args: Array[String]) {
    val n1 = NodeType1("Node1")
    val n2 = NodeType2("Node2")
    val edge = EdgeType1(n1, n2)
    val source = Edge.edgeSource(edge)
    println(source == n1)  // true
  }
}

一个有效的图只能连接一个给定的边缘型给定类型的节点之间如Scala的例子。功能edgesource”提取源节点的边缘,就这么简单。

这里是一个非工作的例子,我想写在OCaml:

type node =
    NodeType1 of string
  | NodeType2 of string

type edge =
    EdgeType1 of NodeType1 * NodeType2
  | EdgeType2 of NodeType2 * NodeType1

let link_source (e : edge) : node =
  match e with
  | EdgeType1 (src, _) -> src
  | EdgeType2 (src, _) -> src

这里的问题是,“nodetypex”而不是类型构造函数。因此,我不能用他们当我描述元组的源和目标的边缘定义。“link_source”函数只能返回一个类型和“节点”的变种可退货。

我一直在尝试如何解决这两Ocaml和Haskell和这里是一个去例,结型包装node_type_x OCaml:

type node_type_1 = NodeType1 of string
type node_type_2 = NodeType2 of string

type node =
    NodeType1Node of node_type_1
  | NodeType2Node of node_type_2

type edge =
    EdgeType1 of node_type_1 * node_type_2
  | EdgeType2 of node_type_2 * node_type_1

let link_source (e : edge) : node =
  match e with
  | EdgeType1 (src, _) -> NodeType1Node src
  | EdgeType2 (src, _) -> NodeType2Node src

但问题是,我复制的类型信息。我在边缘的定义指定源节点类型,并给出在匹配的边缘link_source NodeTypeXNode。

显然我不明白如何解决这个问题。我坚持在阶级层次上思考。什么是正确的方式去表达我所实现的Scala代码在OCaml或Haskell?

answer1: 回答1:

I think the most straightforward translation of your Scala version is using phantom types to mark the node and edge type and bind them to specific constructors with GADTs.

{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}

data Type = Type1 | Type2

data Edge t where
    EdgeType1 :: Node Type1 -> Node Type2 -> Edge Type1
    EdgeType2 :: Node Type2 -> Node Type1 -> Edge Type2

data Node t where
    NodeType1 :: String -> Node Type1
    NodeType2 :: String -> Node Type2

instance Eq (Node t) where
    NodeType1 a == NodeType1 b = a == b
    NodeType2 a == NodeType2 b = a == b

edgeSource :: Edge t -> Node t
edgeSource (EdgeType1 src _) = src
edgeSource (EdgeType2 src _) = src

main :: IO ()
main = do
    let n1   = NodeType1 "Node1"
        n2   = NodeType2 "Node2"
        edge = EdgeType1 n1 n2
        src  = edgeSource edge

    print $ src == n1

This is now actually safer than the Scala version since we know the exact type returned from edgeSource statically instead of just getting an abstract base class that we would need to type-cast or pattern match against.

If you want to mimic the Scala version exactly, you could hide the phantom type in an existential wrapper to return a generic, "unknown" Node from edgeSource.

{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE FlexibleInstances #-}

data Some t where
    Some :: t x -> Some t

edgeSource :: Edge t -> Some Node
edgeSource (EdgeType1 src _) = Some src
edgeSource (EdgeType2 src _) = Some src

label :: Node t -> String
label (NodeType1 l) = l
label (NodeType2 l) = l

instance Eq (Some Node) where
    Some n1 == Some n2 = label n1 == label n2

我认为你的Scala版本最直白的翻译是使用假体类型标记的节点和边缘型和绑定到特定的建设者GADTs。

{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}

data Type = Type1 | Type2

data Edge t where
    EdgeType1 :: Node Type1 -> Node Type2 -> Edge Type1
    EdgeType2 :: Node Type2 -> Node Type1 -> Edge Type2

data Node t where
    NodeType1 :: String -> Node Type1
    NodeType2 :: String -> Node Type2

instance Eq (Node t) where
    NodeType1 a == NodeType1 b = a == b
    NodeType2 a == NodeType2 b = a == b

edgeSource :: Edge t -> Node t
edgeSource (EdgeType1 src _) = src
edgeSource (EdgeType2 src _) = src

main :: IO ()
main = do
    let n1   = NodeType1 "Node1"
        n2   = NodeType2 "Node2"
        edge = EdgeType1 n1 n2
        src  = edgeSource edge

    print $ src == n1

这是现在实际上比Scala版本更安全,因为我们知道从静而不只是得到一个抽象基类,我们需要类型转换或模式匹配对edgesource返回的确切类型。

如果你想模仿的Scala版本完全,你可以躲在一个存在主义的包装幻影类型返回一个通用的,“未知”的节点edgesource。

{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE FlexibleInstances #-}

data Some t where
    Some :: t x -> Some t

edgeSource :: Edge t -> Some Node
edgeSource (EdgeType1 src _) = Some src
edgeSource (EdgeType2 src _) = Some src

label :: Node t -> String
label (NodeType1 l) = l
label (NodeType2 l) = l

instance Eq (Some Node) where
    Some n1 == Some n2 = label n1 == label n2
answer2: 回答2:

Edit: the answer with GADTs is much more direct.

Here's a Haskell version (without unsafeCoerce), which is one possible translation of your Scala code. I can't help with an OCaml solution however.

Note that, in Haskell, == cannot be used on values of different type (and the ability to do so in Scala is frequently frowned upon and a source of annoyance and bugs). However, I've provided a solution below for comparing different node types if you really need it. If you don't truly need it, I'd recommend avoiding it, as it depends on GHC features/extensions that make your code less portable and could potentially even cause problems for the type checker.

WITHOUT polymorphic node comparison:

{-# LANGUAGE TypeFamilies, FlexibleContexts #-}
-- the FlexibleContexts extension can be eliminated
-- by removing the constraint on edgeSource.

-- let's start with just the data types
data NodeType1 = NodeType1 { name1 :: String } deriving Eq
data NodeType2 = NodeType2 { name2 :: String } deriving Eq
data NodeType3 = NodeType3 { name3 :: String } deriving Eq

data EdgeType1 = EdgeType1 { source1 :: NodeType1, target1 :: NodeType2 }
data EdgeType2 = EdgeType2 { source2 :: NodeType2, target2 :: NodeType1 }

-- you tell the compiler that the node types
-- somehow "belong together" by using a type class
class    Node a         where name :: a -> String
instance Node NodeType1 where name = name1
instance Node NodeType2 where name = name2
instance Node NodeType3 where name = name3

-- same about the edges, however in order to
-- map each Edge type to a different Node type,
-- you need to use TypeFamilies; see
-- https://wiki.haskell.org/GHC/Type_families
class Edge a where
  type SourceType a
  -- the constraint here isn't necessary to make
  -- the code compile, but it ensures you can't
  -- map Edge types to non-Node types.
  edgeSource :: Node (SourceType a) => a -> SourceType a

instance Edge EdgeType1 where
  type SourceType EdgeType1 = NodeType1
  edgeSource = source1

instance Edge EdgeType2 where
  type SourceType EdgeType2 = NodeType2
  edgeSource = source2

main = do
  let n1     = NodeType1 "Node1"
      n2     = NodeType2 "Node2"
      edge   = EdgeType1 n1 n2
      source = edgeSource edge
  print (source == n1)  -- True
--  print (source == n2)  -- False  -- DOESN'T COMPILE

WITH polymorphic node comparison:

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- again, constraint not required but makes sure you can't
-- define node equality for non-Node types.
class (Node a, Node b) => NodeEq a b where
  nodeEq :: a -> b -> Bool

-- I wasn't able to avoid OVERLAPPING/OVERLAPS here.
-- Also, if you forget `deriving Eq` for a node type N,
-- `nodeEq` justs yield False for any a, b :: N, without warning.
instance {-# OVERLAPPING #-} (Node a, Eq a)   => NodeEq a a where
  nodeEq = (==)
instance {-# OVERLAPPING #-} (Node a, Node b) => NodeEq a b where
  nodeEq _ _ = False

main = do
  let n1     = NodeType1 "Node1"
      n2     = NodeType2 "Node2"
      edge   = EdgeType1 n1 n2
      source = edgeSource edge
  print (source `nodeEq` n1)  -- True
  print (source `nodeEq` n2)  -- False

The abov is not the only way to tell the Haskell type system about your constraints, for example functional dependencies seem applicable, and GADTs.


Explanation:

It's worth understanding why the solution seems to be more direct in Scala.

Scala's a hybrid between subtype polymorphism based OO such as the one found in C++, Java/C#, Python/Ruby, and (often Haskell-like) functional programming, which typically avoids subtyping a.k.a. datatype inheritance, and resorts to other, arguably better, forms of polymorphism.

In Scala, the way you define ADTs is by encoding them as a sealed trait + a number of (potentially sealed) case classes and/or case objects. However, this is only a pure ADT only if you never refer to the types of the case objects and case classes, so as to pretend they are like the Haskell or ML ADTs. However, your Scala solution indeed uses those types, i.e. it points "into" the ADT.

There's no way to do that in Haskell as individual constructors of an ADT do not have a distinct type. Instead, if you need to type-distinguish between individual constructors of an ADT, you need to split the original ADT into separate ADTs, one per constructor of the original ADT. You then "group" these ADTs together, so as to be able to refer to all of them in your type signatures, by putting them in a type class, which is a form of ad-hoc polymorphism.

编辑:GADTs的回答更直接。

这里是一个Haskell版本(无unsafecoerce),这是你的Scala代码的一个可能的翻译。我不能帮助但OCaml的解。

在Haskell,注意,= =不能用于不同类型的值(在斯卡拉这样做的能力是经常皱眉、烦恼和错误的来源)。然而,我提供了一个解决方案,用于比较不同的节点类型,如果你真的需要它。如果你不真的需要它,我建议你避开它,因为它依赖于GHC的特点/扩展,使你的代码更便携,甚至潜在的类型检查器引起问题。

无多态节点比较:

{-# LANGUAGE TypeFamilies, FlexibleContexts #-}
-- the FlexibleContexts extension can be eliminated
-- by removing the constraint on edgeSource.

-- let's start with just the data types
data NodeType1 = NodeType1 { name1 :: String } deriving Eq
data NodeType2 = NodeType2 { name2 :: String } deriving Eq
data NodeType3 = NodeType3 { name3 :: String } deriving Eq

data EdgeType1 = EdgeType1 { source1 :: NodeType1, target1 :: NodeType2 }
data EdgeType2 = EdgeType2 { source2 :: NodeType2, target2 :: NodeType1 }

-- you tell the compiler that the node types
-- somehow "belong together" by using a type class
class    Node a         where name :: a -> String
instance Node NodeType1 where name = name1
instance Node NodeType2 where name = name2
instance Node NodeType3 where name = name3

-- same about the edges, however in order to
-- map each Edge type to a different Node type,
-- you need to use TypeFamilies; see
-- https://wiki.haskell.org/GHC/Type_families
class Edge a where
  type SourceType a
  -- the constraint here isn't necessary to make
  -- the code compile, but it ensures you can't
  -- map Edge types to non-Node types.
  edgeSource :: Node (SourceType a) => a -> SourceType a

instance Edge EdgeType1 where
  type SourceType EdgeType1 = NodeType1
  edgeSource = source1

instance Edge EdgeType2 where
  type SourceType EdgeType2 = NodeType2
  edgeSource = source2

main = do
  let n1     = NodeType1 "Node1"
      n2     = NodeType2 "Node2"
      edge   = EdgeType1 n1 n2
      source = edgeSource edge
  print (source == n1)  -- True
--  print (source == n2)  -- False  -- DOESN'T COMPILE

具有多态节点比较:

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- again, constraint not required but makes sure you can't
-- define node equality for non-Node types.
class (Node a, Node b) => NodeEq a b where
  nodeEq :: a -> b -> Bool

-- I wasn't able to avoid OVERLAPPING/OVERLAPS here.
-- Also, if you forget `deriving Eq` for a node type N,
-- `nodeEq` justs yield False for any a, b :: N, without warning.
instance {-# OVERLAPPING #-} (Node a, Eq a)   => NodeEq a a where
  nodeEq = (==)
instance {-# OVERLAPPING #-} (Node a, Node b) => NodeEq a b where
  nodeEq _ _ = False

main = do
  let n1     = NodeType1 "Node1"
      n2     = NodeType2 "Node2"
      edge   = EdgeType1 n1 n2
      source = edgeSource edge
  print (source `nodeEq` n1)  -- True
  print (source `nodeEq` n2)  -- False

前述不告诉Haskell的类型系统对你的约束的唯一途径,例如函数依赖是适用的,GADTs。


解释:

这是值得理解为什么解决方案在斯卡拉似乎是更直接的。

斯卡拉的亚型多态性等面向对象的C++中发现之间的混合,# java / C,Python,Ruby,和(通常是Haskell等)功能的编程,通常避免又名类型继承分型,和度假村等,可能更好,形式的多态性。

在斯卡拉,你确定ADTs是通过编码为一个密封的特质+数(可能密封)类和/或实例对象。然而,这只是一个纯粹的ADT只有你绝不是指案件的对象和类的类型,以便假装喜欢Haskell或毫升ADT。然而,你确实使用Scala解决这些类型,即点”为“ADT。

有没有办法在Haskell作为一个ADT个人构造函数没有显著的类型。相反,如果你需要的类型区分一个ADT个人构造函数之间,你需要分割成独立的原ADT ADT,每一个原始的ADT的构造函数。然后你的“集团”这些概念联系在一起,从而能够参考你的类型签名给他们,将他们放在一个类,这是一种特别的多态性。

answer3: 回答3:

You were asking too much of the Ocaml type system. At this point in your second attempt:

let link_source (e : edge) : node =
match e with
| EdgeType1 (src, _) -> 

you are saying: It should be clear that src is of node_type_1, and I gave the return type node, so the compiler should be able to sort out the correct constructor to use from the type of src. However this is not possible in general: In a given variant, there is not a unique mapping from 'member types' to constructors; for example: type a = A of int | B of int. So you do have to specify the constructor (you could name it shorter).

If you don't want that you have to use polymorphism. One option is make the src_link function polymorphic. One attempt would be

type e12 = node_type_1 * node_type_2
type e21 = node_type_2 * node_type_1

let link_source = fst

but then you have to expose the link types as tuples separately. Another option is to use polymorphic variants.

type node1 = [`N1 of string]
type node2 = [`N2 of string]
type node3 = [`N3 of string]
type node = [node1 | node2 | node3]
type edge = E12 of node1 * node2 | E21 of node2 * node1

then one could write

let link_source (e:edge) : [<node] = match e with
  | E12 (`N1 s, _) -> `N1 s 
  | E21 (`N2 s, _) -> `N2 s

this automaticallly unifies the return type and checks that it's an existing node. The last pattern match can also be handled by type coercion:

let link_source (e:edge) : node = match e with
  | E12 (n1, _) -> (n1:>node)
  | E21 (n2, _) -> (n2:>node)

GADTs can also help. With the same definitions for node{,1,2,3} above one can define

type ('a, 'b) edge =
  | E12 : node1 * node2 -> (node1, node2) edge
  | E21 : node2 * node1 -> (node2, node1) edge

and then a polymorphic

let link_source : type a b . (a, b) edge -> a = function
  | E12 (n1, _) -> n1
  | E21 (n2, _) -> n2

addendum: when using GADTs it's not necessary to use polymorphic variants. So, one can just have

type node1 = N1 of string
type node2 = N2 of string
type node3 = N3 of string

and the same definitions of edge and link_source will work.

你问了太多的OCaml型系统。在这一点上你的第二次尝试:

let link_source (e : edge) : node =
match e with
| EdgeType1 (src, _) -> 

你说:应该明确的是,SRC是node_type_1,我将返回类型的节点,所以编译器应该能够解决使用Src键入正确的构造函数。然而,这一般是不可能的:在一个给定的变体,没有从一个独特的映射类型成员的构造函数;例如:A = B的int int |所以你必须指定的构造函数(你可以叫它短)。

如果你不想你必须使用多态性。一种选择是让src_link功能多态性。一个尝试将

type e12 = node_type_1 * node_type_2
type e21 = node_type_2 * node_type_1

let link_source = fst

但你有暴露的链接类型分别为元组。另一种选择是使用多态变体。

type node1 = [`N1 of string]
type node2 = [`N2 of string]
type node3 = [`N3 of string]
type node = [node1 | node2 | node3]
type edge = E12 of node1 * node2 | E21 of node2 * node1

然后一个人可以写

let link_source (e:edge) : [<node] = match e with
  | E12 (`N1 s, _) -> `N1 s 
  | E21 (`N2 s, _) -> `N2 s

这种自动结合的返回类型和检查,这是一个现有的节点。最后的模式匹配也可以通过类型强制处理:

let link_source (e:edge) : node = match e with
  | E12 (n1, _) -> (n1:>node)
  | E21 (n2, _) -> (n2:>node)

GADTs也可以帮助。与节点{相同的定义,1,2,3 }以上可以定义

type ('a, 'b) edge =
  | E12 : node1 * node2 -> (node1, node2) edge
  | E21 : node2 * node1 -> (node2, node1) edge

然后是多态

let link_source : type a b . (a, b) edge -> a = function
  | E12 (n1, _) -> n1
  | E21 (n2, _) -> n2

附录:当使用GADTs没必要使用多态性变异。所以,一个人可以拥有

type node1 = N1 of string
type node2 = N2 of string
type node3 = N3 of string

和边缘相同的定义和link_source工作。

scala  haskell  types  ocaml