can accurately capture the correlation between domains and enhance the performance of the model.
Table 2: Comparison of different models on SMD.

Model                              BLEU   F1     Navigate F1   Weather F1   Schedule F1
Mem2Seq (Madotto et al., 2018)     12.6   33.4   20.0          32.8         49.3
KB-retriever (Qin et al., 2019)    13.9   53.7   54.5          52.2         55.6
GLMP (Wu et al., 2019)             13.9   60.7   54.6          56.5         72.5
DF-Net (Qin et al., 2020)          15.2   60.0   56.5          52.8         72.6
MDN (ours)                         16.0   61.2   55.1          56.6         75.6
Table 3: Comparison of different models on MultiWOZ 2.1.

Model                              BLEU   F1     Restaurant F1   Attraction F1   Hotel F1
Mem2Seq (Madotto et al., 2018)     6.6    21.6   22.4            22.0            21.0
GLMP (Wu et al., 2019)             6.9    32.4   38.4            24.4            28.1
DF-Net (Qin et al., 2020)          7.8    34.2   37.4            40.3            30.4
MDN (ours)                         8.9    34.2   34.5            35.4            33.8
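The F1 columns in Tables 2 and 3 follow the micro-averaged entity F1 protocol that is standard for these benchmarks: the knowledge-base entities in a generated response are compared against those in the gold response, with true positives, false positives, and false negatives accumulated over the whole test set. A minimal sketch of that computation is shown below; it assumes entities have already been extracted per turn, and the function name and normalization details are illustrative rather than taken from the official evaluation scripts.

def entity_f1(predicted_entities, gold_entities):
    """Micro-averaged entity F1 over a corpus.

    Both arguments are lists of per-turn entity sets. Illustrative
    sketch: the official evaluation scripts also normalize entity
    surface forms (e.g., casing, underscores) before matching.
    """
    tp = fp = fn = 0
    for pred, gold in zip(predicted_entities, gold_entities):
        tp += len(pred & gold)   # entities correctly generated
        fp += len(pred - gold)   # spurious entities
        fn += len(gold - pred)   # missing entities
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)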
5  CONCLUSION

In this work, we propose a multi-domain data enhanced network that explicitly strengthens domain knowledge for multi-domain dialogues. We adopt an attention mechanism to evaluate the correlation between the current input and each domain, using this correlation as the criterion for generating individual-domain features. In addition, both the encoder and the decoder use query vectors to retrieve external knowledge, which improves response accuracy. Experiments on two public datasets demonstrate that our model outperforms prior models. Moreover, our model adapts well across domains, since it exploits the semantic similarity between domains to transfer knowledge to specific domains with small datasets.
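As a rough illustration of the domain-correlation step summarized above, the following sketch scores the encoding of the current input against learned domain embeddings with dot-product attention and uses the resulting weights to mix the individual-domain features. The tensor shapes, function name, and the specific dot-product attention form are assumptions for illustration, not the exact MDN implementation.

import torch
import torch.nn.functional as F

def mix_domain_features(utterance_enc, domain_emb, domain_feats):
    """Fuse per-domain features, weighted by input-domain correlation.

    utterance_enc: (batch, hidden) encoding of the current input.
    domain_emb:    (num_domains, hidden) learned domain embeddings.
    domain_feats:  (batch, num_domains, hidden) individual-domain features.
    Illustrative sketch only; shapes and attention form are assumed.
    """
    # Correlation between the current input and each domain.
    scores = utterance_enc @ domain_emb.T             # (batch, num_domains)
    weights = F.softmax(scores, dim=-1)               # correlation as the criterion
    # Weighted combination of the individual-domain features.
    return (weights.unsqueeze(-1) * domain_feats).sum(dim=1)  # (batch, hidden)

Under such a scheme, highly correlated domains dominate the fused representation, which is what would let a domain with little data borrow features from semantically similar, data-rich domains.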
REFERENCES

Bordes, A., Boureau, Y.-L., & Weston, J. (2016). Learning end-to-end goal-oriented dialog. arXiv preprint arXiv:1605.07683.

Chen, H., Liu, X., Yin, D., & Tang, J. (2017). A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explorations Newsletter, 19(2), 25-35.

Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.

Eric, M., & Manning, C. D. (2017). Key-value retrieval networks for task-oriented dialogue. arXiv preprint arXiv:1705.05414.

Gu, J., Lu, Z., Li, H., & Li, V. O. (2016). Incorporating copying mechanism in sequence-to-sequence learning. arXiv preprint arXiv:1603.06393.

Guo, J., Shah, D. J., & Barzilay, R. (2018). Multi-source domain adaptation with mixture of experts. arXiv preprint arXiv:1809.02256.

Jin, D., Gao, S., Kim, S., Liu, Y., & Hakkani-Tur, D. (2021). Towards zero and few-shot knowledge-seeking turn detection in task-orientated dialogue systems. arXiv preprint arXiv:2109.08820.

Joshi, C. K., Mi, F., & Faltings, B. (2017). Personalization in goal-oriented dialog. arXiv preprint arXiv:1706.07503.

Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Kulhánek, J., Hudeček, V., Nekvinda, T., & Dušek, O. (2021). AuGPT: Auxiliary tasks and data augmentation for end-to-end dialogue with pre-trained language models. arXiv preprint arXiv:2102.05126.

Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. Paper presented at the International Conference on Machine Learning.

Lei, W., Jin, X., Kan, M.-Y., Ren, Z., He, X., & Yin, D. (2018). Sequicity: Simplifying task-oriented dialogue systems with single sequence-to-sequence architectures. Paper presented at the Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

Madotto, A., Wu, C.-S., & Fung, P. (2018). Mem2Seq: Effectively incorporating knowledge bases into end-to-end task-oriented dialog systems. arXiv preprint arXiv:1804.08217.

Papineni, K., Roukos, S., Ward, T., & Zhu, W.-J. (2002). BLEU: a method for automatic evaluation of machine translation. Paper presented at the Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.

Park, N., & Kim, S. (2022). How do vision transformers work? arXiv preprint arXiv:2202.06709.

Qin, L., Che, W., Li, Y., Wen, H., & Liu, T. (2019). A stack-propagation framework with token-level intent detection for spoken language understanding. arXiv preprint arXiv:1909.02188.
Qin, L., Xu, X., Che, W., Zhang, Y., & Liu, T. (2020). Dynamic fusion network for multi-domain end-to-end task-oriented dialog. arXiv preprint arXiv:2004.11019.
Raghu, D., & Gupta, N. (2018). Disentangling language and knowledge in task-oriented dialogs. arXiv preprint arXiv:1805.01216.

Serban, I. V., Sordoni, A., Bengio, Y., Courville, A., & Pineau, J. (2015). Hierarchical neural network generative models for movie dialogues. arXiv preprint arXiv:1507.04808, 7(8), 434-441.
Vinyals, O., Fortunato, M., & Jaitly, N. (2015). Pointer networks. Advances in Neural Information Processing Systems, 28.